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Abstract 


The relationship between conformations of bioactive molecules with their pharmacol- 
ogical profiles has been well established. Only the unique biologically active confor- 
mation of a drug molecule can bind to the active site of the receptor. In this thesis, 
conformational analysis of bioactive compounds at various environments will be dis- 
cussed. Two categories of molecules were investigated; cannabinoid (CB) analogues 
and [60]fullerene derivatives. The major structural characteristics of these molecules 
are: (i) amphiphilicity and (ii) existence of flexible and rigid pharmacophoric seg- 
ments. Their flexible segments constitute a challenging field for conformational 
analysis exploring of putative bioactive conformations. 

In case of CBs, a set of novel A s -tetrahydrocannabinol (A S -THC) and cannabidiol 
(CBD) analogues were subjected to three-dimensional quantitative structure-activity 
relationships (3D-QSAR) studies using comparative molecular field analysis 
(CoMFA), and comparative molecular similarity indices analysis (CoMSIA) method- 
ologies. The high active compound C-l'-dithiolane A S -THC analogue AMG3 at the 
data base was selected as template molecule. Using molecular modeling techniques 
such as Monte Carlo (MC), molecular dynamics (MD) and grid scan analysis, the pu- 
tative bioactive conformation of AMG3 in solution was determined. This conformer 
was used as a template, and CB1 and CB2 pharmacophore models were developed. 
The availability of homology models of CB1 and CB2 receptors based on rhodopsin 
has allowed the conformational analysis studies of AMG3 at the binding site of the 
receptor. Derived low energy conformers of AMG3 at the receptor site have been 
compared with its in solution conformations. The steroelectronic properties of binding 
cavities of a receptor model are directly related to the performed molecular model co- 
ordinates. In the presented thesis, a homology modeling study based on [12 -adrenergic 
receptor for both CB1 and CB2 receptors was also performed and results were com- 
pared with rhodopsin based homology models. Similar binding sites of CB1 and CB2 
receptors using rhodopsin based models have been generated using the (32-adrenergic 
based receptors. The QSAR models were re-generated using putative bioactive con- 
formers of AMG3 at the binding site of the CB1 and CB2 receptors. Relative contri- 
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butions of steric/electrostatic fields of the 3D QSAR/CoMFA and CoMSIA pharma- 
cophore models have shown that steric effects govern the bioactivity of the com- 
pounds, but electrostatic interactions also play an important role. The comparison of 
derived QSAR models has shown that increasing the complexity level of calculations 
(mimicking more accurately the biological conditions) was positively affected the ob- 
tained statistical result. The optimal QSAR partial least square (PLS) analysis was 
used as an input in the de novo drug design studies and these simulations provided 
novel CB analogues with enhanced predicted binding affinities. 

In case of fullerene derivatives, a series of experimentally reported as well as compu- 
tationally designed monoadducts and bisadducts of [60]fullerene analogues have been 
used in order to analyze the binding interactions between fullerene based inhibitors 
and human immunodeficiency virus type I aspartic protease (HIV-l PR), employing 
docking studies. MD simulations of ligand-free and the inhibitor-bound HIV-1 PR 
systems complemented the above studies and provided proper input structure of HIV - 
1 PR in docking simulations. The obtained results revealed a different orientation of 
the [1-hairpin flaps at these two systems. In inhibitor bound system, the flaps of the 
enzyme are pulled in toward the bottom of the active site (the closed form) while, in 
ligand-free system flaps shifted away from the dual Asp25 catalytic site and this sys- 
tem adopts a semi-open form. The structural analysis of these systems at catalytic and 
flexible flap regions of the HIV-1 PR through the simulations, assisted in understand- 
ing the structural preferences of these regions, as well as, the adopted orientations of 
fullerene derivatives within the active site of the enzyme. The reported most active 
fullerene analogue in the data base has been used as template and 3D QSAR models 
were derived. Based on obtained contour plots and derived PLS analysis, de novo 
drug design studies were performed in order to propose novel analogues with en- 
hanced binding affinities. Such structures may trigger the interest of medicinal chem- 
ists for synthesizing novel HIV-1 PR inhibitors possessing higher bioactivity, consid- 
ering the urgent need for new anti- HIV drugs. 
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Chapter 1. Introduction 
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1.1 Motivation 


Drug design is an iterative process which begins with identification of a compound 
that displays an interesting biological profile and proceeds until the activity profile is 
characterized by minimum undesirable side effects and chemical synthesis is opti- 
mized. 1 One of the main issues faced currently by the pharmaceutical industry is find- 
ing appropriate ligands for a given target protein and ensuring that they are highly 
specific for that target. Finding a therapeutic compound that binds selectively to a tar- 
get receptor is not an easy task in the laboratory. Thus, this problem can be handled 
by using a combination of computer simulations together with laboratory work. 
Thanks to the wide spread availability of high performance computers and new com- 
putational techniques, receptors and their binding interactions with ligands can be de- 
fined at a molecular level with high accuracy. 

The physicochemical and biological properties of a molecule depend on the confor- 
mations that it can adopt. Conformational analysis is the study of the conformations of 
a molecule and their influence to its chemical, physical or biological properties . 2 The 
modern conformational analysis studies were initiated by D. H. R. Barton 3 , who 
showed that the reactivity of substituted cyclohexanes was influenced by the equato- 
rial or axial nature of the substituents. The development of conformational analysis 
was enhanced due to the advancement of analytical techniques (e.g., infrared (IR) 
spectroscopy, nuclear magnetic resonance (NMR) spectroscopy and X-ray 
crystallography) which allow to explore the conformational properties of a molecule. 

One of the frequently problems faced in drug design is to find the conformation of a 
molecule that adopts when it fits its target binding site (bioactive conformation). The 
relationship between the conformations of bioactive molecules with their pharmacol- 
ogical profiles has been well established. Only a unique conformation of a drug mole- 
cule can bind to the binding site of the receptor. The knowledge of the conformations 
of a ligand at the binding site of a receptor assists in the rational approach to drug de- 
sign. One might suggest that the most stable conformation in solution is likely to be 
the bioactive conformation since the molecule is most likely to be in that conforma- 
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tion. However, the bioactive conformation of a ligand at the active site of the receptor 
is not necessarily identical with the lowest energy conformation, in solution. This is 
attributed to the fact that favourable binding interactions of the molecule with its tar- 
get can lead a complex stabilization when the molecule adopts conformation that ap- 
pears in solution with higher energies. 4 Nonetheless, a very high energy conformation 
that is excluded from the population of conformations in solution cannot be biologi- 
cally active. Moreover, the investigation of the optimal conformation of a bioactive 
compound also plays an important role in three dimensional quantitative structure- 
activity relationships (3D QSAR) studies, because the output from the constructed 3D 
QSAR models is directly related with the alignment of the molecules in data set, 
based on template conformer. 

The aim of applying 3D QSAR studies is to derive indirect binding information from 
the correlation between the biological activity of a training set of molecules and their 
3D structures. The importance of steric and electrostatic characteristics is revealed by 
aligning structurally similar analogues using pharmacophoric features as structural 
superimposition guides. In contrast to the rigid molecules, defining the bioactive con- 
formation of flexible ligands is more complex and therefore the alignment procedure 
of 3D QSAR is one of the most difficult steps. The combination of molecular model- 
ing techniques and 2D NMR spectroscopy can be used to obtain the low energy con- 
formation of a potent ligand in the data set which will serve as a template compound 
in the construction of QSAR models. In order to determine the linear correlation 
coefficients between actual versus calculated binding affinities, statistical analysis of 
the data can be used. The derived 3D QSAR models help to predict binding affinity 
values of ligands prior to their synthesis. The structure and binding affinity relation- 
ships of compounds can be graphically plotted and used to explain different 
steroelectronic requirements of ligands for binding to the receptor site. Contour results 
can be used as pilot models for proposing novel analogues before their synthesis. 

The easiest way to analyze a bioactive conformation is to study the X-ray crystal 
structure of ligand-bound target structure. The structure of the protein/ligand complex 
can then be analyzed using computational methodologies and conformation of the 
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ligand can be identified. However, not all proteins can be easily crystallized (e.g., 
membrane proteins), thus other methodologies have to be used in order to identify the 
active conformation. If the X-ray structure of target is unknown, a homology model 
target can be created by molecular modeling based on known X-ray structure of a pro- 
tein from same family, thus binding sites may be constructed to aid the drug design 
process. If one of the active compounds for a specific target is a rigid molecule which 
has one possible conformation, it can be used as template for more flexible molecules. 
The geometry of the pharmacophore (atoms and functional groups required for a spe- 
cific pharmacological activity, and their relative positions in space) can be determined 
for the rigid molecule, and flexible molecules can be compared with this rigid mole- 
cule in order to find conformation which will place the important binding groups in 
the same relative geometry. 4 If a fully rigid molecule is not available to act as a tem- 
plate, it may be possible to match up different ligands that have a rigid part some- 
where in their skeleton. Energy minimization methods play a crucial role in the con- 
formational analysis. 4 If possible, it is desirable to identify all low energy conforma- 
tions on the energy surface, but the number of minima may be so large that is not 
practical to pick up the conformer that has lowest energy. Under such circumstances 
population analysis can be performed using statistical mechanics. Solvation effects 
may also be important to include in the calculations of intramolecular energy. Several 
computational techniques (e.g., molecular dynamics (MD) simulations, Monte Carlo 
(MC) calculations) can be used as part of a conformational search strategy. More de- 
tailed information on conformational search strategy will be given in the “Theoretical 
Background” chapter of the thesis. 

When no experimental structural information is available, molecular modeling tech- 
niques can be used to reveal the bioactive conformations of the ligand. These tech- 
niques include: (i) geometry optimization calculations, in gas phase; (ii) geometry op- 
timization calculations employing a continuum model which simulates the biological 
medium; (iii) rotational energy barrier calculations; (iv) MC simulations; (v) MD 
simulations; and (vi) 3D QSAR models. Additional molecular modeling techniques 
can be used when the X-ray structure or homology models of a receptor are available: 
More specifically; in silico docking; and MD simulations of ligand at the active site of 
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the receptor can be performed. If the X-ray structure or a homology model of the tar- 
get protein is available, the investigation of a bioactive conformation of molecule be- 
comes more realistic and can be obtained with greater credibility. 

1.2 Investigated Systems 

In the present work, conformational analysis of bioactive compounds at various envi- 
ronments will be discussed. Two categories of molecules were studied: (i) cannabi- 
noid (CB) analogues and (ii) [60]fullerene derivatives. The major structural character- 
istics of these molecules are: (i) amphiphilicity and (ii) existence of flexible and rigid 
pharmacophoric segments. Their flexible segments constitute a challenging field for 
conformational analysis exploring of putative bioactive conformations. 

1.2.1 CBs 

CB agonists have been suggested to have potential therapeutic uses such as neuropro- 
tective, analgesics, apetite stimulants, anti-emetics, anti-glaucoma agents, and for the 
treatment of diseases associated with inappropriate retention of aversive memories 
(e.g., post-traumatic stress disorders and phobias). 5 The pharmacological activity of 
CBs is mediated by G protein coupled receptors (GPCRs) CB1 and CB2. On binding 
of agonists, GPCRs become activated, presumably by conformational changes in the 
transmembrane (TM) domain. The CB1 receptor is localized primarily in the central 
nervous system (CNS), reflecting its most abundant GPCR prevalence in brain. An 
interesting feature of CB 1 is its ability to be activated by structurally different classes 
of molecules, thus increasing the possibility of multiple activated forms of the recep- 
tor. Although detectable at exceedingly low levels in brain, CB2 receptors are ex- 
pressed mainly by immune cells and mediate immune responses, inflammatory and 
neuropatic pain. 6 

Presently known CB analogues show susceptibility towards enzymatic hydrolysis 
and/or not have CB1/CB2 receptor selectivity. There is considerable interest to design 
CB analogues possessing selectivity, high affinity and metabolic stability. Such ana- 
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logues may supply favorable response with fewer undesirable side effects and higher 
metabolic stability. 7 

Identification of the binding conformations of CB agonists within the binding site of 
the receptor is of great interest not only to understand the key binding interactions be- 
tween the amino acid residues and the ligand, but also provide insight into the mo- 
lecular mechanism of receptor activation. 8 

A set of novel synthesized A s -tetrahydrocannabinol (A 8 -THC) and cannabidiol (CBD) 
analogues were subjected to 3D QSAR studies using comparative molecular field 
analysis (CoMFA) 9 , and comparative molecular similarity indices analysis (CoM- 
SIA) 10 methodologies in order to propose new selective and high affinity CB ana- 
logues. The high active compound C-l'-dithiolane A 8 -THC analogue (— )-2- 
(6a,7,10,10a-tetrahydro-6,6,9-trimethylhydroxy-6H-dibenzo[b,d]pyranyl)-2-hexyl- 
l,3dithiolane (AMG3), (Figure l.li) was selected as the template molecule. Using 
combination of several molecular modeling techniques such as molecular mechanics 
(MM) and quantum mechanics (QM) geometry optimization calculations, MD simula- 
tions, MC calculations and grid scan analysis, the putative bioactive conformation of 
AMG3 in solution was determined. This conformer was used as the template structure 
and CB 1 and CB2 pharmacophore models were developed. 

The availability of homology models of CB1 and CB2 receptors based on bovine 
rhodopsin, allowed the conformational analysis studies of AMG3 at the binding sites 
of CB receptors. Firstly, in silico docking simulations were performed using template 
ligand at the CB1 and CB2 receptors and cluster analysis was accompanied to the ob- 
tained binding poses. Secondly, the derived best binding pose of protein/ligand com- 
plex structure has been used as an input coordinate, and MD simulations have been 
performed in the presence of membrane bilayers. Trajectory analysis from MD simu- 
lations of generated snapshots clarified the favored adopted low energy conformers of 
template compound. The derived low energy conformers of AMG3 at the binding site 
of the receptor have been compared with those produced in solution. QSAR models 
were re-generated using putative bioactive conformers of AMG3 at the binding site of 
the CB1 and CB2 receptors. Generated QSAR models in solution and at the binding 
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site of the CB1 and CB2 receptors were compared to define environment that more 
closely resembles the bioactive conformer. 

The derived optimal QSAR partial least square (PLS) analysis of CB models were 
used in the de novo drug discovery program for the predictions of novel compounds 
with enhanced predicted binding affinities. 

Since the steroelectronic properties of binding cavities of a receptor model are directly 
related to the performed molecular model coordinates, a homology modeling study 
based on p2 -adrenergic receptor for both CB1 and CB2 receptors was also performed 
and results were compared with former rhodopsin based homology models. 

1.2.2 [60]Fullerene Derivatives 

In the last few years, many interesting biological applications of [60]fullerene deriva- 
tives have started to be investigated due to their promising biological activities such 
as DNA photocleavage, human immunodeficiency virus type I aspartic protease 
(HIV-l PR) inhibition, apoptosis and neuroprotection. 11 12 Indeed, unique spherical 
shape of [60]fullerene may be envisaged as fitting the hydrophobic cleft that is often 
characterizing the target structures. When the cleft hosts the guest molecule and the 
intermolecular steroelectronic interactions are sufficiently strong, inhibitory effect 
may be expected. 

The inhibition of HIV-1 PR by fullerene analogues has been demonstrated by Fried- 
man et a/. 11 ' 13 and the complexation of HIV-1 PR with fullerene compounds has been 
supported by molecular modeling studies. These studies showed that the fullerene de- 
rivatives can be perfectly accommodated inside the binding pocket of HIV-1 PR. 
However, the binding affinity (Kj) values of “first generation” fullerene inhibitors 
were not significant (Kj ~10" 6 M). Thus, further structural investigation is required in 
order to propose new HIV-1 PR/fullerene complexes with better binding affinity. 
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A series of experimentally reported as well as computationally designed monoadducts 
and bisadducts of [60]fullerene analogues have been used in order to analyze the 
binding interactions between fullerene based inhibitors and HIV-1 PR enzyme em- 
ploying docking studies. MD simulations of ligand-free and the inhibitor-bound HIV- 
1 PR systems complemented the above studies and provided proper input structure of 
HIV-1 PR in docking simulations. The structural analysis of these systems at catalytic 
and flexible flap regions of the HIV-1 PR through the simulation, assisted in under- 
standing the structural preferences of these regions, as well as, the adopted orienta- 
tions of fullerene derivatives within the active site of the enzyme. 

The reported most active [60]fullerene in the data base ((3'-Phenyl-3'-(a- 
hydroxybenzyl)-l,2-cyclopropano]buckminsterfullerene), (Figure l.lii)) was used as 
template and 3D QSAR/CoMFA and CoMSIA models were derived. Based on con- 
tour plots and PLS analysis from the models, de novo drug design studies were per- 
formed in order to propose novel analogues with enhanced binding affinities. Such 
structures may trigger the interest of medicinal chemists for novel HIV-1 PR inhibi- 
tors possessing higher bioactivity. 



(i) 



Figure 1.1 (i) Molecular structures of potent bioactive CB analogue AMG3; and (ii) 
an anti-HIV- 1 PR bioactive fullerene analogue. 
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1.3 The Aim of the Study 


The aim of this study is to shed some light on the following questions addressed in the 
CB and fullerene fields. 

(i) Can 3D QSAR studies help to design putative bioactive novel CB and 
fullerene analogues with higher binding affinities and metabolic stabili- 
ties? 

(ii) Can 3D QSAR studies help to design, in particular, selective novel CB de- 
rivatives for CB1 and CB2 receptors? 

(iii) Is there any connection between biological activity of fullerene derivatives 
and the flap motion of the HIV-1 PR enzyme? 

(iv) How conformational changes at the catalytic site of the HIV-1 PR affect 
the biological activity of the fullerene derivatives? 

(v) How conformations of a template ligand derived using different environ- 
ments affect the 3D QSAR models? 

1.4 Outline of the Thesis Structure 

The thesis structure is organized as follows. Chapter 1 introduces the reader to the 
subject and aims of this Ph.D. dissertation. Chapter 2 provides the “theoretical back- 
ground” of some key concepts necessary to understand the discussed obtained results. 
In the Chapter 3, details of computational techniques used in the calculations are dis- 
cussed. Chapter 4 provides discussion for the used strategies in the rational drug de- 
sign. In the Chapter 5 , a thorough conformational analysis of potent CB ligand AMG3 
using several molecular modeling techniques, its binding interactions with CB1 and 
CB2 receptor models, comparative 3D QSAR studies of CB analogues, homology 
modeling calculations of CB receptors and de novo drug design studies of CB ligands 
are discussed. In the Chapter 6, conformational analysis and binding interactions of 
[60]fullerene derivatives at the binding pocket of HIV-1 PR, 3D QSAR and de novo 
drug design studies are discussed. A summary of the main results is given in the last 
chapter (Chapter 7). Concluding remarks for the impact of this dissertation are also 
highlighted. 
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Chapter 2. Theoretical Background 
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2.1 Some Useful Definitions for Many-body Systems 


Approaches used in computational chemistry can be divided to two broad parts, em- 
pirical and quantum approaches. Empirical approaches (i.e., MM) use simple models 
of harmonic potential, electrostatic interaction, and dispersion forces for basic com- 
parisons of energetics and geometry optimization. MM methods are extremely fast 
and are able to handle very large systems, such as enzymes . 2 Quantum approaches are 
roughly divided into semi-empirical methods and non-empirical (or ab initio ) meth- 
ods. Semi-empirical methods are the approximate methods in which parameters in- 
volved in the equations are taken from experiment, some are neglected, and some oth- 
ers are estimated by fitting to experimental data. Like MM methods, semi-empirical 
methods use experimentally derived parameters; and like ab initio methods, they are 
basically quantum mechanical (QM) in nature. The main difference between semi- 
empirical and ab initio methods is the extensive use of approximations and parameters 
optimized with respect to experimental data by the former approach. This allows 
semi-empirical methods to reduce the computational cost, while the computed results, 
in general, provide useful data . 14 

2.1.1 QM Calculations 

The QM approach postulates the fundamental principles and then uses these postu- 
lates to deduce experimental results. For the definition of the state of a system in QM, 
the function of the coordinates of particles is referred as the wave function or state 
function V F. In general, the state changes with time, thus for one-particle, one- 
dimensional system, v F= v F(x, t). The wave function contains all possible information 
about a system . 14 

Suppose there is a single particle (e.g., an electron of mass to) which is moving 
through space (given by a position vector r - xi + yj + zk) under the influence of an 
external potential d. In order to find the future state of a system from the knowledge 
of its first state, an equation is needed that tells how the wave function changes with 
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time (7). 15 Schrodinger' s time dependent equation describes the particle by a wave 
function 15 T^r, t): 


{- 


tr , d 1 d 2 

( T ^ T + 

2m dx~ dy~ 


| t ) + 5 | '¥(r,t) = ih 
cz~ 


dV(rj) 

dt 


(where, h = h/2;t, h is the Planck’s constant and i 2 = -1) 


When the external potential d is independent of time, then the wave function can be 
written as the product of a spatial part and a time part; V F (r,t) = v|/(r)T(t). In many ap- 
plications of QM, the potential is considered as independent of time, thus the time- 
dependent Schrodinger equation can be written in the more familiar, time-independent 
form 214 ’ 15 : 

fi 2 

{- — V 2 + 9}'¥(r) = EW(r) 

2 m 

d 2 d 2 d 2 

where, E is the energy of the particle and V 2 = — - h h . 

dx dy dz 

Usually, V 2 +i9 is abbreviated as Hamiltonian operator//. Thus, Schrodinger 

2m 

equation is reduced to // V P=£T. In order to solve the Schrodinger equation for many- 
body systems, Hartree-Fock (HF) and density functional theory (DFT) are the com- 
monly used approaches in the QM calculations. In the HF approximation, instead of 
calculating repulsions between electrons in the system explicitly, repulsions are calcu- 
lated between one electron and the average field of all of the other electrons. In the 
DFT approximation, the total electron density is decomposed into one-electron densi- 
ties, which are constructed from one-electron wave functions. 16 ' 17 


2.1.2 Semi-empirical Calculations 

The greatest proportion of the computational time at the ab initio calculation is in- 
variably spent calculating and manipulating integrals. In order to reduce the computa- 
tional effort, the most obvious way is to neglect or approximate some of these inte- 
grals. Semi-empirical methods consider only valance electrons of the system and the 
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core electrons are subsumed into the nuclear core. The key point in semi-empirical 
methods is the overlap matrix S (in Roothaan-Hall equation FC=SCE), which is ap- 
proximated by the identity matrix I. 18 Therefore, all diagonal elements of the overlap 
matrix are equal to one and all off-diagonal elements are zero. Thus, the Roothaan- 
Hall equation FC=SCE becomes FC=CE (F represents the Fock matrix, is a sum of 

one- and two-electron contributions, C is the molecular orbital coefficients and E is 

1 8 

the energy levels). 

2.1.3 MM Calculations: Empirical Force Field Models 

Generally, most of the systems that need to be studied using molecular modeling are 
too large for QM calculations. QM calculations include the electrons in a system, al- 
though some of them may be ignored (e.g., semi-empirical calculations), a large num- 
ber of particles must still be considered, thus the calculations are time-consuming. 2 In 
the MM (also known as force field methods) not the electronic motions but only the 
function of nuclear positions of the system is included in the calculations of the en- 
ergy of the system. 2 Thus, in contrast to ab initio methods, MM is used to compute 
molecular properties (e.g., geometrical properties, relative stability of conformers) 
which do not depend on electronic effects. 2 


Today, many of the MM force fields use relatively simple four components of the in- 
tra- and inter-molecular forces within the system. Energetic penalties are associated 
with the deviation of bond lengths, bond angles and torsion angles from their refer- 
ence or equilibrium values. More sophisticated force fields may include additional 
terms, but they invariable contain these four components 2 ' 16,17,19 : 
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where, i9(r N ) denotes the potential energy, which is a function of the positions (r) of 
N particles (e.g., atoms). The first term represents the interaction between pairs of 
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bonded atoms. This is modeled by a harmonic potential that leads to an increase in 
energy as the bond length (/) deviates from the reference value. The second term is 
associated with bond angle deformations modeled using a harmonic potential. The 
third term represents the torsional potential and models the energy changes relatively 
to bond rotations. The fourth term is associated with the non-bonded atoms and is cal- 
culated between all pairs of atoms (/ and j) that are in different molecules or in the 
same molecule but separated by at least three bonds. This term is usually modeled us- 
ing a Lennard-Jones potential for van der Waals interactions and Coulomb potential 
for electrostatic interactions. 

2.2 Continuum Representations of the Solvent 

Most of the chemical processes take place in a solvent, thus it is important to consider 
how the solvent affects the behaviour of the system. In some cases, the solvent merely 
acts as a ‘bulk medium’ , but it can still significantly affect the solute behaviour, with 
dielectric properties of the solvent often being particularly crucial . 2 In such cases, it 
would be useful not to have explicit solvent molecules in the system, to enable us to 
concentrate on the behaviour of the solute(s). The solvent acts as a perturbation on the 
gas-phase behaviour of the system. This is the purpose of the ‘continuum’ solvent 
models . 2 

2.2.1 Solvation Free Energy 

The solvation free energy (AG S0 /) is the free energy change to transfer a molecule 
from vacuum to solvent phase. The AG so i can be considered to have three compo- 
nents 2 : 

A G so i AG e i ec + AG ir /„. + A G cav 

where, AG e i ec is the electrostatic component, AG W / VV is the van der Waals interaction 
between the solute and solvent. AG rm . is the free energy required to form the solute 
cavity within the solvent. This component comprises the entropic penalty associated 
with the reorganization of the solvent molecules around the solute together with the 
work done against the solvent pressure in creating the cavity . 2 
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2.2.2 Electrostatic Contributions to the Free Energy of Solvation 


Born derived an expression for the electrostatic contribution of the AG S „/ by placing a 
charge within a spherical solvent cavity. 2 In the Born model 20 , AG e i ec of an ion equals 
to the work done to transfer the ion from vacuum to the medium. This in turn is equal 
to the difference in the electrostatic work to charge the ion in two environments. The 
work to charge an ion in medium of dielectric constant s equals to cf!2za, where q and 
a are the charge on the ion and the radius of the cavity, respectively. Thus, AG so / is the 
difference in the work done in charging the ion in the dielectric and in vacuo 2 : 


Q 


1 
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In many cases, solvent effects can be incorporated into a force field model. It is possi- 
ble to study larger systems with the empirical models, in which case it is necessary to 
include dielectric properties of both solute and solvent. The generalized Born equation 
has been widely used to represent the AG e i ec contribution to the AG V „/. 2 The model in- 
cludes a system of particles with radii a, and charge q,. The total electrostatic free en- 
ergy of this system is given by the sum of the Coulomb energy and the Bom free en- 
ergy of solvation in a medium of relative permittivity 2 e: 
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2.3 Computer Simulations Methods 

2.3.1 Energy Minimization Method 

The aim of energy minimization is to find a set of coordinates representing a molecu- 
lar conformation such that the potential energy of the system is at a minimum. This 
can be formally stated as follows: given a function / which depends on one or more 
independent variables xj, x 2 , xj,...x„ find the values of those variables where /has a 
minimum value. 2 ' 21 At the minimum, the first derivative of the function with respect 

to each variable is zero and second derivatives are all positive (^— = 0 ; and 0 L > 0). 

dx t d>Ci 
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I nf ormation provided by energy minimization calculations in some cases can be suffi- 
cient to predict accurately the properties of a system. If low energy conformations of a 
system on an energy surface can be identified then statistical mechanics techniques 
can be used to derive a partition function from which thermodynamic properties can 
be calculated. However, this is possible only for small molecular assemblies in the gas 
phase. Computer simulation methods assist to study large systems and predict their 
properties through the use of techniques that consider small replications of a macro- 
scopic system with manageable numbers of atoms. 2 Simulations generate a time- 
dependent behavior of these small replications in such a way that accurate values of 
structural and thermodynamic properties can be obtained with a feasible computation 
time. 

2.3.2 Molecular Dynamics (MD) Method 

The essence of the MD method is in the numerical integration of Newton’s second 
law relating the mass and acceleration of an atom in the system to the gradient of the 
potential energy function and its associated force field. 19 Thus, an approximate veloc- 
ity for the atom can be computed with the given acceleration for a given period of 
time and changes in atomic coordinates can be determined. MD is a deterministic 
method, that is the state of the system at any future moment can be predicted from its 
current state. 2 For this, continuous nature of potentials requires the equation of motion 
to be integrated by breaking the calculations into a series of very short time steps (St, 
usually in fs time level). Finite difference techniques are used to generate MD trajec- 
tories with continuous potential models. At each time step, the forces on the atoms are 
computed and combined with the current positions and velocities to generate new po- 
sitions and velocities a short time ahead (t+ 8t). 2A 9 There are many algorithms for in- 
tegrating the equations of motion, using finite differences and they assume that the 
positions and dynamic properties of the atoms consisting the system under study can 
be calculated as Taylor series expansions 2 : 

r(f+ 8t)=r(t) + Stv(t) + -St 2 ait ) + -St 3 b{t) + —8t 4 c{t ) + .... 

2 6 24 

v(f+ St)—\it) +5 ta(t) + -Sr b(t) + -St 3 c(t ) + — 8t 4 d{t) + .... 

2 6 24 


24 



where, v is the velocity (the first derivative of the positions with respect to time), a is 
the acceleration (the second derivative), b is the third derivative, and so on. The Ver- 
let algorithm is one of the most widely used algorithm for the integrating the equa- 
tions of motion in MD simulations. It uses the positions and accelerations at time t, 
and the positions from the previous step r(t-St), in order to calculate the new positions, 
r (t+ St). The system is followed for user defined time, taking care that the temperature 
and pressure remain at the required values, and coordinates as a function of time are 
written to an output file at regular intervals. Thus, MD simulation generates trajectory 
files that describe how the dynamic variables change through simulation. Usually, si- 
mulation time in the MD calculations is between hundreds of pico-second (ps) and a 
few nano-second (ns) level. In the last years, MD simulations are extensively used to 
investigate the conformational properties of flexible molecules. 

2.3.3 Monte Carlo (MC) Method 

MC is another computer simulation method, while the low energy conformations of a 
system are connected to the time, in a MD simulation; in a MC simulation, each con- 
formation depends only to the predecessor and not upon any other conformations pre- 
viously visited. 19 The MC technique derives conformations randomly and uses a spe- 
cial set of criteria to decide whether or not to accept each new conformation. These 
criteria ensure that the derived conformation is equal to its Boltzman factor 
exp{-.9(r N )/k B T}, where &( r N ) is calculated using the potential energy function. 2 In 
MC method, each new conformation of the system is generated by randomly rotating 
the bonds. The energy of the new system is then calculated using the potential energy 
function, and if the energy of the new system is lower than the energy of its predeces- 
sor then the new conformation is accepted. If the energy of the new system is higher 
than its predecessor, then the Boltzmann factor of the energy difference is calculated 
with exp{-(i9 BeH ,(r N ) - (9 old (r N Jj/k B T j and compared with a generated random number 

between 0 and l. 2 If the Boltzmann factor is greater than the random number then the 
new conformation is accepted, if not, then initial conformation is retained for the next 
move. 2 Indeed, numbers generated from random number generator are not truly ran- 
dom because the same sequence of numbers should always be generated when the 
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program is run with the same initial conditions. Usually, the linear congruential 
method is used for generating random numbers; each number in the sequences gener- 
ated by taking the previous number, multiplying by a constant (multiplier, a) , adding 
a second constant (increment, b), and taking the remainder when dividing by the third 
constant (modulus, m); ( £[i] = MOD{ (| - l)xa + b),mj).’ If the constants are chosen 
carefully, the method generates all possible numbers between 0 and m-1. The linear 
congruential method generates integral values, which can be converted to real num- 
bers between 0 and 1 by dividing by m. 2 

There are some difficulties applying the MC simulations to flexible molecules. It has 
been found that, even small movements away from an equilibrium bond length, cause 
a large increase in the energy. One of the widely used methods to overcome this prob- 
lem is to freeze out some of the internal degrees of freedom, such as bond lengths and 
bond angles. 2 ' 16 ’ 19 

2.4 Methods for Exploring Conformational Space 

2.4.1 Grid Search 

In the grid search, all of the rotatable bonds are identified in the system and these 
bonds are then systematically rotated through 360° using a fixed increment (during the 
search bond lengths and bond angles remain fixed). 2 Then, every derived conforma- 
tion is subjected to energy minimization in order to obtain low energy conformations. 
For example, in the alanine dipeptide (Figure 2.1) conformational energy surface can 
be obtained by grid search algorithm. If it is assumed that the amide bonds adopt trans 
conformations then only two torsional angles yj and <p of the backbone are flexible. 
Then, the energy is the function of these two variables and can be represented as con- 
tour maps. 2 These contour plots in amino acids is also known as Ramachandran map, 
after G. N. Ramachandran who found that the amino acids were restricted to a limited 
range of conformations. A representative Ramachandran map (for HIV-1 PR enzyme 
which has hundreds of yj and <p torsion angles) is shown in Figure 2.2, where points 
lie on the axes indicating N- and C- terminal residues for each subunit. The most al- 
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lowed region of Ramachandran space is colored blue, and partially allowed regions 
are colored green. 



Figure 2.1 The structure of alanine dipeptide. 



cp ( degree ) 

Figure 2.2 Ramachandran plot of experimentally derived structure of HIV-1 PR en- 
zyme (pdb, 1AID). 

2.4.2 Random Search Methods 

In the random search method, conformational space can be explored by changing ei- 
ther the atomic Cartesian coordinates or the torsion angles of rotatable bonds. At each 
iteration, a random change is performing to the current conformation and the new 
structure is then refined using energy minimization. If this conformation has not been 
obtained previously, it is stored. A new starting conformation is then chosen for the 
next iteration and the cycle starts again. The procedure continues until a given number 
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of iterations have been performed. The systematic search ends when all possible com- 
binations of bond rotations are covered, however in the random search, there is no 
natural endpoint. 2 

2.4.3 Distance Geometry in NMR 

2D-NOESY (Nuclear Overhauser effect spectroscopy), 2D-ROESY (rotating-frame 
Overhauser effect spectroscopy) and 2D-COSY (correlated spectroscopy) are mainly 
used experimental methods in the conformational analysis problems. 2 NOESY gives 
the information about the distance between atoms which are close together in space 
but may be separated by many chemical bonds. The strength of the NOESY signal is 
inversely proportional to the sixth power of the distance, thus it is possible to calcu- 
late approximate values for the distance between the pairs of atoms. ROESY is an ex- 
periment in which homonuclear NOE effects are measured under spin-locked condi- 
tions and usually used when the NOE is small for any internuclear distance or mixing 
time. COSY experiments are used to provide information for the atoms which are co- 
valently separated by one to three bonds. 2 Usually, 2D-NMR spectroscopy and mo- 
lecular modeling calculations are combined for the elucidation of low energy con- 
formers of a molecule. 

2.4.4 Exploring Conformational Space by Simulation Methods 

MC and MD simulations can be performed to explore the conformational space of 
molecules. There are some difficulties in using MC method for flexible molecules, as 
it has been discussed in section 2.3.3. A common strategy for a conformational space 
problem, solved by MD simulations is to perform the simulations at a very high tem- 
perature (in some cases physically unrealistic temperatures). A high-temperature MD 
simulation may be able to overcome high energy barriers, thus explore conformational 
space. 
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2.4.5 Clustering Algorithms 


Many of the conformational analysis algorithms can generate conformations that are 
very similar. In this case, it is desirable to select a representative small set of confor- 
mations with a subsequent analysis (e.g., cluster analysis). Cluster analysis groups 
similar objects, from which the representatives can be picked up. Cluster analysis 
measures the similarity between the pairs of objects. In comparison of conformations 
of a molecule, the root mean square deviation (RMSD) would be an obvious measure 
to use 2 : 


RMSD = 



where, N at0 ms is the number of atoms and d, is the distance between the coordinates of 
atom i in the two structures. Another method is to measure similarities of torsional 
angles between the structures 2 : 

N,o, 

a..- 7 (w ■ — W ) 

u-ij — / j \ rr m.i m,j ' 
m = 1 


where, w m j is the value of torsional angle m in conformation i. N,„, is the total number 
of torsional angles. 


2.5 The Use of Molecular Modeling and Cheminformatics Methods in 
the Design of New Molecules 


2.5.1 Molecular Docking 

Molecular docking or (in silico docking) is a method which predicts the preferred 
conformations of one molecule to a second one when bound to each other to form a 
stable complex. Molecular docking can be considered as a dynamic procedure where 
the correct geometry of a “key” is sought which will open the “lock”. 
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The docking problem involves several degrees of freedom, for example there are six 
translational and rotational degrees of freedom of a molecule relative to other as well 
as the conformational degrees of freedom of each molecule . 2 ' 14 ' 16 Many algorithms 
have been developed to solve the docking problem. These can be grouped according 
to the number of degrees of freedom that they ignore. The simplest algorithms use 
two molecules as rigid bodies, thus explore only six degrees of translational and rota- 
tional freedom (rigid docking). In flexible docking, the conformational degrees of 
freedom need to be taken into consideration. Most of the current docking programs 
treat only ligand as flexible and receptor is considered rigid (flexible docking). Con- 
formational search methods have been incorporated at some stage in the docking cal- 
culations. For example, MC simulation is used for the changing of internal conforma- 
tion of ligand. The binding energy of ligand at the binding site of the receptor is cal- 
culated by MM methods. The ideal docking method would allow flexibility to both 
ligand and receptor to explore their conformational degrees of freedom. Thus, the 
most ‘natural’ way to incorporate flexibility to ligand and receptor is via MD simula- 
tions to the ligand/receptor complex. However, these calculations are computationally 
very demanding and, in practice, MD is used for refinement purposes of docking 
modes. Thus a successful initial docking pose has to be obtained before MD is ap- 
plied. 

2.5.2 Scoring Functions for Molecular Docking 

Most of the docking programs are able to generate a large number of docking poses. 
Some of them have very high energy and clash with protein, therefore are neglected. 
The rest of the solutions are ranked using some scoring functions. Usually, scoring 
functions attempt to approximate the ligand -receptor binding free energy. The free 
energy of binding can be calculated with several simulation techniques but these cal- 
culations are very time-consuming and too slow to be of value in docking calcula- 
tions. More approximate and faster methodologies have to be used. In contrast to the 
free energy perturbation methods, these algorithms compute the free energy of bind- 
ing as an additive equation of various components 2 ' 22 : 
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AGbind AG solvent AG conf~^~ AG//i("^ A Grot AG//,- AG vib 


where, AG JO ; ra , is the contribution of the solvent effects arising from the interactions 
between the solvent and the ligand, protein and intermolecular complex. AG co „/ is 
contribution of conformational changes in the protein and in the ligand (usually, dock- 
ing algorithms assume a rigid receptor, thus AG CO nf depends only on the conforma- 
tional changes of ligand). AG is the free energy, due to ligand-protein interactions. 
AG ro , (entropic contribution) is the free energy loss because of freezing internal rota- 
tions of protein and the ligand. This penalty function is simply calculated by assuming 
three states per rotatable bonds ( trans and gauche±) of equal energy, thus leading to a 
free energy loss of R T ln3 (~0.7 kcal/mol) per rotatable bond. AG,/,- is the penalty 
function for the loss of translational and rotational free energy caused by the associa- 
tion of two-bodies (ligand and receptor) to give a single body (the intermolecular 
complex). AGi,,/, is the free energy of the system due to changes in vibrational modes 
(calculations of this term is very difficult and do not affect significantly the total sum, 
thus, they are usually ignored). There is a considerable discussion for each of the 
terms in above equation in literature, and for some of them, there may be a number of 
different ways for their estimation. However, many of these approaches are not suit- 
able for the docking calculations, because they require a high computational cost. 2 
Despite the simplicity of scoring functions, such functions rate well in comparisons of 
different functional forms. One of the interesting scoring function methods was sug- 
gested by Bohm. In this approach, a simple relationship between the free energy of 
binding and a variety of parameters as well as fast calculation has been considered 2 ' 23 : 

AG Wnd = AG () + AG„ X /(Ar, Aa) + A G wnic £ /(Ar, Aa) + AG lipo I A lipo I +A G rot N, 

H— bonds ionic 

where, AG 0 is a constant term and it is independent from the system. It was inter- 
preted as corresponds to AG,/, term, in the previous equation which shows the overall 
change in translational and rotational free energy. AG n-bonds is the contribution from an 
ideal hydrogen bond and multiplied by a penalty function /(Ar, Aa), which accounts 
for large deviations of a hydrogen bond from ideal geometry (assumed as 1.9 A and 
180°). AG ionic is the contribution from an unperturbed ionic interactions. AGn po is the 
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contribution from lipophilic interactions, which are assumed to be proportional to the 
lipophilic contact surface between ligand and receptor An po . AG rot is the loss of free 
energy due to freezing a rotatable bond in the ligand upon binding, and it is multiplied 
by a number of rotatable bonds in the ligand (N rot ). 

2.5.3 3D QSAR Studies 

2.5.3.1 Structure-Activity Relationships (SAR) 

Molecular properties are coded by molecular structure. Compounds with similar 
structures often tend to have similar pharmacological activity. However, they usually 
exhibit differences in potency, undesirable side effects and in some cases different 
binding affinities. These structurally related differences are commonly referred to as 
SAR. 

SAR can be defined as “the relationship between chemical structure and pharmacologi- 
cal activity for a series of compounds ”. 4 24 Traditional SAR studies are usually carried out 
by making large numbers of analogues of the lead compound and testing them for bio- 
logical activity at a specific target. A SAR study of a lead compound and its analogues 
can be used to determine the segments of the lead compound that arc crucial for its bind- 
ing affinity, that is, its pharmacophore as well as its unwanted side effects. This informa- 
tion can be used subsequently to design a new drug that has increased affinity and fewer 
unwanted side effects than the existing drug (optimize its SAR). Traditional SAR 
investigations are useful tools in the search for new drugs, however they are expensive in 
both personnel and materials. Thus, a number of attempts have been made to improve the 
traditional SAR studies. 

2.5 .3.2 QSAR 

The success of the SAR approach to drug design depends not only on the knowledge 
and experience of the drug design group but also may related with luck. QSAR is an 
attempt to remove the luck factor from drug design by establishing a relationship in 
the form of a mathematical equation between biological activity and the measurable 
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physicochemical parameters of a drug that represents its properties such as lipophilic- 
ity, shape and electron distribution, which have major effects on the activity. 4 ' 24 Quan- 
titative structure-property relationships (QSPR) are also used, particularly when a 
specific property other than biological activity is considered. If an equation is formed, 
then a medicinal chemist could determine from the equation the value of parameter, 
and hence the structures, that would optimize the activity. These predictions allow 
medicinal chemists to make a more informed choice as to what analogues to design 
and synthesize. Obviously, this could considerably cut down the cost of drug devel- 
opment. The relationship between these numerical properties and the activity is de- 
scribed by a general equation; n =f(p), where o is the activity (usually defined as log 
(1/C), where C is the concentration of the compound required to produce standard re- 
sponse in a given time), and p is the molecular descriptor (i.e., structure -derived prop- 
erties of the molecule). These properties that influence the activity of a drug are quite 
diverse, the major ones being lipophilicity, steric effects and electronic effects. The 
parameters commonly used to represent these properties are partition coefficients for 
lipophilicity, Taft steric constants for steric effects, and Hammett o constants for elec- 
tromc effects. ' 

2.5.3.2.1 Partition Coefficients (P) 

A drug has to pass through the biological membranes in order to reach its site of ac- 
tion. Consequently, the relative solubilities of a drug in the aqueous medium and lip- 
ids are crucial in the transport of that drug to its site of action, especially at the aque- 
ous medium/lipid interface. P measures the distribution of a compound between two 
immiscible solvents, thus attempts have been made to correlate the activities of drugs 
with their lipid/water partition coefficients. It is not easy to measure P in situ, so the 
less accurate model systems (e.g., organic solvent/aqueous solution) are used; P = 
[drug in the organic phase] / [drug in the aqueous phase]; (usually, water or a phos- 
phate buffer at the pH of blood (7.4) is the most commonly used aqueous phase and n- 
octanol is the most commonly used organic phase). 4 ' 24 The nature of the relationship 
obtained depends on the range of P values for the used compounds. If this range is 
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small the results may be expressed as a straight line equation having the general 
form 2 : 

log (1/C) = ki.log P + k 2 
where, Iq and k 2 are the constants. 


Over large ranges of P values, the graph of log (1/C) versus log P often has a para- 
bolic form with a maximum value (log P°). Thus, there is an optimum balance be- 
tween the aqueous and lipid solubility for maximum activity. Below P° the drug will 
be reluctant to enter the lipid bilayer, whereas, above P° the drug will be reluctant to 
leave from the lipid bilayer. 4 Therefore, analogues that have P values near the P° are 
likely to be the most active and worth for further investigation. Hansch showed that 
many of these parabolic relationships could be represented reasonably accurately by 
equations of the form" : 

log (1/C) = -k, (log P) 2 + k 2 log P + k 3 

An alternative way to express the Hansch equation is to use a lipophilic substituent 
constant ( 71 ). This is the logarithm of the P of a compound with substituent X relative 
to a parent compound in which the substituent is hydrogen" : 

7i = log (Px/Ph) 


Thus, 


log (1/C) = -kj (log n ) 2 + k 2 log n + k 3 

n can be used as an alternative to P,when dealing with a series of analogues in which 
only the substituents are different. 


2.5.3.2.2 Electronic Parameters 


The distribution of the electrons in a drug molecule will have a considerable influence 
on the distribution and activity of the drug. As a general rule, non-polar and polar 
drugs in their unionized form are usually more readily diffused through membranes 
than polar drugs and drugs in their ionized forms. 4 ' 24 Once the drug reaches the ligand 
binding site at the receptor the distribution of electrons in its structure will determine 
the type of electrostatic bonds between drug and receptor. 4 ' 24 The distribution of elec- 
trons within a molecule depends on electron-donating and electron-withdrawing 
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groups found in the molecule. 24 Louis Hammett observed that the dissociation con- 
stants (K) of aromatic acids are influenced by the electronic properties of the substitu- 
ents on the phenyl ring. For example, benzoic acid is weakly ionized in water 24 : 


COOH 

coo 

rS 


A 





The K values of substituted benzoic acids indicate that electron-withdrawing groups 
(e.g., nitro group) increase the dissociation, while electron-donating groups (e.g., 
ethyl) decrease it. A similar effect exists for other equilibrium such as substituted 
phenyl acetic acids (Table 2.1)" : 


.COOH 

A, 



When plotting the quantity log (K/Ko) for benzoic acids on the x axis, where K and Ko 
are the constants for unsubstituted and substituted compounds, respectively, and cor- 
responding values measured for the same set of substituents in phenylacetic acids on 
the y axis, Hammett obtained a straight line. Because of the association between dis- 
sociation constants and free energies (AG = -2.3RTlogK) this phenomenon is known 

24 

as the linear free energy relationship." 


Benzoic acid 

R 

K 

log (K/Ko) 

H 

6.2 x 10 5 (Ko) 

0 

no 2 

37.1 x 10 5 

0.78 

Et 

4.4 x 10 5 

-0.15 


Phenylacetic acid 

R 

K 

log (K/Ko) 

H 

5.2 x 10 5 (Ko) 

0 

no 2 

14.1 x 10 5 

0.43 

Et 

4.2x 10 5 

-0.09 


Table 2.1 K and log (K/Ko) values for unsubstituted and substituted benzoic acids 
(left) and for same set of unsubstituted and substituted phenylacetic acids. 

Thus, straight line on Figure 2.3 can be written as a linear equation, the Hammett 
equation 24 : 
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log (K/K 0 ) = p log (K/Ko ); 
log (K/K 0 ) = p o 

where, p is related to a given scaffold (e.g., phenylacetic acids), and o is a descriptor 
of a substituent and describes its influence on the K. The parameter p describes the 
magnitude of the effect a substituent can exert on the dissociation reaction of a given 
scaffold. As the distance between the substituent and the dissociated proton increases, 
its influence on the dissociation reaction decreases and so does the value of p. A 
negative or positive o value indicates that substituent is acting as an electron-donor or 
electron-withdrawing groups, respectively. The value of a varies depending on the 
position of the substituted group (i.e., ortho, meta, para) in the molecule. A substitu- 
ent may have an opposite sign of o depending on its position on the ring because o 
includes both inductive and resonance contributions of the electron distribution. For 
example, o m for the methoxy group of the m-methoxybenzoic acid is 0.12, however c p 
( p - methoxybenzoic acid) has the value of -0.27, because in the former case the elec- 
tron distribution is dominated by the inductive contribution, whereas in the latter case 
it is controlled by the resonance effect. 24 



Figure 2.3 Plot of log (K/K 0 ) values of benzoic acids versus log (K/K 0 ) values of 
phenylacetic acids. 

2.5.3.2.3 Steric Parameters 
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The Taft steric parameter (E s ) was the first attempt to show the relationship between a 
measurable parameter related to the shape and size (bulk) of a drug and the dimen- 
sions of the target site and activity of a drug . 24 Taft used the relative rate constants of 
the acid-catalysed hydrolysis of a-substituted methyl ethanoates in order to define his 
steric parameter, because it had been shown that the rates of these hydrolyses were 
almost entirely dependent on steric factors” : 

c xCHtCOOR ) _ , , 1 , 

— “j 7 — ^(XCH 2 COOR) — K(CH } C00R) 

tog £(C// 3 C00«) 

where, k is the rate constant of the appropriate hydrolysis, when X = H, E s = 0. 

E s is an experimental value based on rate constants for a given model reaction. The 
bulkier the substituent, the more negative the E s . 

Hansch postulated that the biological activity of a drug could be related to all or some 
of above stated factors by a simple mathematical relationships based on the general 
formula 4 ' 24 : 

log(l/C) = C| (partition parameter) + C 2 (electronic parameter) + C 3 (steric parameter) + c 4 

where, ci, C 2 , C 3 and c 4 are numerical coefficients obtained by feeding the data into a 
suitable computational statistical package. Parameters in the above equation are also 
known as molecular descriptors or simply descriptors. It must be noted that, several 
parameters different than above parameters (e.g., molecular weight, density, molar 
refractivity, etc.) can be used to correlate biological activity. 

2.5.3.2.4 Deriving a QSAR Equation 

The starting point for deriving a QSAR equation is the study table. It consists of a 
spreadsheet with molecules (data set) across the rows and molecular characteristics 
(biological activity, descriptors) down the columns. Typically, the first column indi- 
cates the molecular identification (e.g., compound number), the second column shows 
the activity of compounds in the data set, and subsequent columns present the values 
of the corresponding descriptors (Table 2.2). The most widely used method for deriv- 
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ing QSAR equations is linear regression, which uses the least-squares fitting to find 
the best combination of coefficients in the equation. The least-squares technique can 
be illustrated by using a simple case where the activity is a function of only one de- 
scriptor. 4 Therefore, the form of equation will be y = mx + c, where, y is the depend- 
ent variable (observations), (e.g., activity) and x is the independent variables (descrip- 
tors), (e.g., log/ 3 ). The objective of a regression analysis is to find the coefficients m 
and c that minimize the sum of deviations of the observed values from the fitted equa- 
tion. 4 ’ 24 

The least-squares coefficients m and c in the linear regression equation are given 

, 2 , 4,24 

by : 

n 

Y J ( x i- < x>)(y i - < y >) 

m = — ; c =< y > —m < x > 

i=l 

where, <x> and <y> are the mean values of the independent and dependent variables, 
respectively. The ‘quality’ of a simple linear regression equation is often reported as 
the squared correlation coefficient r 2 value. This value indicates the fraction of the 
total variation in the dependent variables that is explained by the regression equation. 
In order to determine r value, the total sum of squares (TSS) of the deviations of the 
observed y values from the mean <y> value is calculated together with the explained 
sum of squares (ESS), which is the sum of square deviations of the y values calculated 
from the model, y ca ic.,i from the mean. The y ca ic.,i is obtained by feeding the appropriate 
Xi value into the regression equation. Other common squared term is the residual sum 
of squares (RSS), which is the sum of squares of the differences between the observed 
and calculated y values 2 ’ 3 ’ 24 : 

TSS=£(>>,.-<}>» 2 , ESS = £o„, £J -<>») 2 , RSS = f>,-y c „ CJ ) 2 

1=1 1=1 i=l 

where, TSS is equal to the sum of ESS and RSS. The r is then given by: 

r 2 = (ESS/TSS) = (1 - (RSS/TSS)) 
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The r 2 adopts values between 0 and 1 ; where, 0 indicates that none of the variations in 
the observations is explained by variation of independent variables in the equation and 
1 indicates that all of the variation in the observed values can be explained. 
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Table 2.2 An example of QSAR study table. 

It is not always possible to correlate biological activities with a single descriptor (lin- 
ear model with one descriptor). Given that biological action results from the com- 
bined influence of many factors, one can extend the QSAR model to multiple descrip- 
tors. Indeed, the observation that several parameters used simultaneously can lead to 
good models prompted the development of a method referred to as multiple linear re- 
gression (MLR). 

In MLR, the activity is expressed as a linear combination of descriptors. In most 
cases, the obtained model with the minimum sum of deviations from the observations 
is not perfect and an error ( e ) is usually unavoidable. Thus, 

111 

y = T b J x j + e 

j = i 

where, y is the activity, xj is the value of descriptor j and bj its associated coefficient, e 

i 4 24 

is the error. ’ 

Another quantity that is commonly reported in the QSAR studies is the F statistics. 
The F value is the ratio of the explained mean square divided by the residual mean 
square. The F values are available in statistical tables at different levels of confidence; 
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if the calculated value is greater than this tabulated value, then the equation is signifi- 
cant at that particular confidence level." It must be noted that the F value depends 
upon the number of independent values in the equation and the number of data 
points. 24 As the number of data points increases and/or the number of independent 
variables falls, the value of F which corresponds to a particular confidence level also 
decreases. Because, the aim is to obtain a model to be able to explain a large number 
of data points with an equation containing as few variables as necessary; such an 
equation would be expected to have a greater predictive power. The F-test employs 
the F-distribution to test whether the r 2 obtained from the MLR analysis significantly 
differs from 0. The larger the F-value, the larger the probability that r 2 significantly 
differs from the 0. This is formally taken into account via the number of degrees of 
freedom associated with each parameter. An MLR is associated with N-l degrees of 
freedom, because the fitted line always passes through the means of the dependent 
and independent variables. The TSS is always associated with N-l degrees of free- 
dom. If there are p independent variables (descriptors) in the equation, then there are 
N-p-1 degrees of freedom associated with the RSS and p degrees of freedom associ- 
ated with the ESS. Therefore, the explained mean square equals to ESS divided by p\ 
and the residual mean square equals to RSS divided by N-p-1, and so Fis given by" : 

^ ESSN-p-l 

F= —^r 

One of the easiest ways to validate a QSAR model is to calculate the standard error or 
standard deviation (SE or SD), which is calculated as the average square deviation of 
each number (the residuals) from the mean. The smaller the SD, the higher the quality 
of the model 24 : 

yobs. y calc.') 

n — p — 1 

where, y 0 i, s . and y are the observed and calculated activities, respectively, n is the 
number of data points and p is the number of descriptors used. 

2.5.3.2.5 Cross-validation 
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The indices of r 2 and F can be generated to evaluate the quality of the obtained QSAR 
models. However, these parameters basically only tell us about the ability of the 
QSAR model to reproduce the data from which it was derived and not its aptitude to 
predict the activities of new compounds . 4 ' 24 Two methods can be used to estimate the 
predictive power of a QSAR model. The first method is known as the test set method 
and consists of partitioning the initial data into two sets, a preferred strategy when a 
large data set of compounds is available. The initial data set is divided into two parts; 
the first one (training set) is usually used to build a QSAR model and the second one 
(test set) to validate this model. The second method is known as the cross-validation 
method, it is preferred when the data set is too small. In this method, the data set is 
divided to N equal parts, N-l parts are used to build the model which is then used for 
the remaining N th part to predict the activities of the corresponding molecules. The 
procedure is repeated until all activities of all compounds have been predicted inde- 
pendently. The most common form of the cross-validation method is the leave-one- 
out method, in which each data value is left out and a model derived using the re- 
mainder of the data. A value then can be predicted for the data left out and can be 
compared with the true observed value and this procedure is repeated for every data 

point. This procedure permits the calculation of cross-validated r 2 ( r~ ) or q 2 . 

The q 2 is computed by analogy with r 2 , the difference being the use of the predicted 
sum of squares (PRESS) rather than the RSS in the numerator. PRESS is calculated as 
the difference between the measured activity and the predicted activity for the test 

set 2 ' 4 : 

2 PRESS DOT7CC VV \2 

q = 1 - ; press = 2, ( y pred , - v, ) 

T J (y-<y >) 2 ,=1 

i=i 

2.5.3.2.6 Principal Components Regression (PCR) 

MLR can not deal with data sets, where the variables are highly correlated and/or 
where the number of variables exceeds the number of data points. Two methods are 
commonly used to deal with such situations; PCR and partial least squares (PLS). In 
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PCR, the variables are subjected to principle component analysis (PCA), and then re- 
gression analysis is performed using the first few principle components. PCA is a 
commonly used technique for reducing the dimensionality of a data set (dimensional- 
ity of a data set is the number of variables that are used to describe each objects). 
Consider the data shown in Figure 2.4, there is a high correlation between x and y 
values. If we were to define a new variable, z-x+y, then we could express most of 
the variation in the data as the values of this new variable z, which is called principal 
component. In general, the principal component ( p ) is a linear combination of the 
variables 2 ' 4 : 

V 

Pi=H c Uj x i 

1=1 

where, p, is the principal component and a is the coefficient of the variable xj (there 
are v such variables). The first principle component of the data set corresponds to that 
linear combination of the variables that gives the best fit straight line through the data 
when they are plotted in the v-dimensional space. The second and subsequent princi- 
ple components account for the maximum variance in the data not already accounted 
for the previous principle components. Each principle components corresponds to an 
axis in a v-dimensional space, and each principle component is orthogonal to all the 
other principle components. Indeed, in order to explain all of the variation in the data, 
it is usually necessary to include all the principle components (there can be as many 
principle components as there are dimensions in the data), however, in many cases 
only a few principle components are enough to explain a significant proportion of the 
variation in the data. The principle components are calculated using standard matrix 
methods. The first step is to calculate the variance-covariance matrix; if there are s 
observations (number of compounds), each of them contains v values (number of de- 
scriptors), and then the data set can be represented as a matrix D with v rows and s 
columns. The variance-covariance matrix Z is 2 : 

z = d t d 

The eigenvectors of Z are the coefficients of the principle components. As the Z is a 
square symmetric matrix, its eigenvectors will be orthogonal. 2 The eigenvalues and 
their associated eigenvectors can be obtained by solving the secular equation IZ -All = 
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0. The first principle component corresponds to the highest eigenvalue, the second 
principle component to the second highest eigenvalue, and so on. 2 



Figure 2.4 Most of the variance in this set of highly correlated data values can be ex- 
plained in terms of a new variable z,z-x + y. 

A general rule of thumb is that only those principle components that have eigenvalues 
which are greater than 1 should be used for inclusion in a PCR. 

2.5 .3.2.7 Partial Least Squares (PLS) 

PLS is an alternative technique to PCR. The PLS expresses a dependent variable y in 
terms of linear combination of independent variables 2 xf. 

y = bjti+b 2 t 2 +bst 3 + +b m t m 

where, 

tl = Ci 1 Xi+C 12 X 2 +C 1 3X3+ +C lp x p 

t 2 = C 21 Xl+C 22 X2+C23X3 + +C 2p X p 

t3 = C3lXl+C3 2 X 2 +C33X3 + +C3pX p 

Un — mlX j C m2 X2 Cj!3X3 -\~CtnpXp 

t values are called latent variables (or components) and are constructed in such a way 
that they form an orthogonal set. The PLS method is using as a tool for decreasing the 
variations of many correlated descriptors in a data to a few uncorrelated latent vari- 
ables. The use of orthogonal linear combinations of the x values is very similar with 
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PCA. The major difference is that the latent variables in a PLS are constructed to ex- 
plain not only the variation in the independent variables (x) but also to simultaneously 
explain the variation in the dependent variables (observations, y). 

2.5 .3 .3 3D QSAR 

3D QSAR is a method based on statistical correlation analysis enabling the compari- 
son of 3D molecular forces (fields) produced in the vicinity of different compounds to 
find correlations between biological activities and these fields. 2 QSAR approaches 
aim to establish relationships between biological activities and chemical structure. In 
classical QSAR (2D QSAR) molecular properties are described by parameters that are 
not x,y,z dependent (e.g., logP, MR, E s , o, n, etc.), whereas in 3D QSAR, they are 
represented by a set of values of x, y, z functions, measured at many different loca- 
tions in the space around the molecules. Thus, in 3D QSAR, there are many more de- 
scriptors than 2D QSAR. The comparative molecular field analysis (CoMFA) and the 
comparative molecular similarity indices analysis (CoMSIA) are two most popular 
approaches used in 3D QSAR studies. 

2.5 .3 .3.1 CoMFA 

CoMFA was first described by Cramer et al., in 19 8 8. 25 Prior to the CoMFA analysis, 
each compound in the data set has to be subjected to conformational analysis and it is 
presumed as the active conformation at the binding site of the receptor. These con- 
formers must be superimposed on a selected template compound from the data set. 
The molecular fields (e.g., steric, electrostatic) surrounding each molecule are then 
calculated by placing appropriate probe groups at points on a regular lattice (in order 
to simplify calculations of the fields created around a molecule, the method consists 
of superimposing 3D lattice defining grid points regularly distributed in space, and 
calculating the interaction energy between the molecule and the probe at each grid 
point, using a potential energy function, see Figure 2.5). In order to test the presence 
of a field and to measure, it requires the use of probes with associated energy func- 
tions. 2 Usually a probe atom is employed which is placed at well selected points in the 
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space, to quantitatively measure the value of the field created by the molecule at the 
point considered. The probe must be of same type of the field to be measured (e.g., 
van der Waals probes for steric fields, charged probes for electrostatic fields). The 
electrostatic field is obtained by calculating the electrostatic interaction energy be- 
tween the molecule and a probe at each grid point using Coulomb’s law 2 : 


E = ■ 


1 


■X 




4^o i=i 

where, q, and q p are the atomic charges for the ligand and the probe, respectively. 


The steric field is obtained by calculating the van der Waals interaction energy be- 
tween the molecule and the probe at each grid point using for example a Lennard- 
Jones potential 2 : 


*.=Z 


A 


B 


where, A and B are specific parameters, different for each interacting pair of atoms. 



Figure 2.5 3D lattice defining grid points makes it possible to sample the space with a 
finite number of points of molecular interactions fields. 

2.5 .3 .3.2 CoMSIA 

CoMSIA was developed by Klebe et al., w in 1994 and it is an extension of the 
CoMFA methodology. Both models are forms of QSAR and are based on the assump- 
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tion that changes in binding affinities of ligands are related to changes in molecular 
properties, represented by fields. CoMFA approach has been widely accepted and a 
useful technique in 3D QSAR, however, it is not without problems. In particular, both 
steric and electrostatic potential functions are very steep near the van der Waals sur- 
face of the molecule, causing rapid changes in surface descriptions, and they require 
the use of cut-off values so calculations are not done inside the molecular surface. In 
addition, a scaling factor is applied to the steric field, so both fields can be used in the 
same PLS analysis. Finally, changes in orientation of a set of superimposed mole- 
cules, relative to the calculation grid, can cause significant changes in CoMFA results 
due to steric cut-off values. In CoMSIA, similarity indices are calculated at regularly 
spaced grid points for the pre-aligned molecules. A comparison of the relative shapes 
of CoMFA and CoMSIA fields is shown in Figure 2.6. For the distance dependence 
between the probe atom and the ligand atoms a Gaussian function is used. Because of 
the different shape of Gaussian function, the similarity indices can be calculated at all 
grid points, both inside and outside the molecular surface. The equation used to calcu- 
late similarity indices is as follows 9 : 

K,kU^H W pro be ,k W i^ ariq 

i 

where, A is the similarity index at grid point q, summed over all atoms i of the mole- 
cule j under investigation, w pro be,k is the probe atom with radius 1 A and charge +1, wa 
is the actual value of the physicochemical property k of atom i, r,- 9 is the distance be- 
tween the probe atom at grid point q and atom i of the test molecule, a is the attenua- 
tion factor and its optimal value is normally between 0.2 and 0.3. 26 



Figure 2.6 Shapes of various functions. 
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2.5.4 De novo Drug Design 


In de novo design, the 3D structure of the receptor or the 3D pharmacophore is used 
to design new molecules. There are two basic types of de novo design algorithms; 
‘outside-in’ and ‘inside-out’ methods. In the first class of methods, the binding site is 
first analyzed to determine where specific functional groups might bind tightly. These 
groups are connected together to give molecular skeletons, which are then converted 
to ‘real’ molecules. In the second class of methods, molecules are grown within the 
binding site under the control of an appropriate search algorithm, with each proposi- 
tion being evaluated using an energy function. 2 
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Chapter 3. Computational Details 
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3.1 Minimization Methods 


Energy is a function of the atomic coordinates of the system and software programs 
attempt to generate the xyz coordinates of atoms which correspond to a minimum en- 
ergy. This is accomplished by minimization procedures and these techniques are itera- 
tive in which the atomic coordinates are altered from one iteration to next in order to 
minimize the energy. Commonly used optimization methods are: 2 


(i) The Steepest Descent (SD) Method. The SD method uses a first-order minimization 
algorithm and changes the coordinates of the atoms in the system gradually in order to 
reach closer and closer to minimum energy. In the SD method, r is defined as a vec- 
tor of all coordinates of N atoms in the system. Then, the net force on each atom F 
and the potential energy at each iteration step t are calculated. New positions are cal- 
culated by 2 : 

r t+l = r. + F(n) j= 1,2,3, N. 

The geometry optimization process stops, when the specified predetermined threshold 
conditions by the user are fulfilled. The SD is often used for structures far from the 
minimum as a rough and introductory run followed by a subsequent minimization 
procedure employing a more advanced algorithm (e.g., conjugate gradient). 1 


(ii) The Conjugate Gradients (CG) Method. In the SD method, both gradients and di- 
rection of successive steps are orthogonal, however in the CG method, the gradients at 
each point are orthogonal but the directions are conjugate. New positions are calcu- 
lated by 2 : 

r! +1 = r ,'+ K l f = 1,2,3, N. 

where, hf = F(r‘ +1 ) + y\.h\ 


and, a scalar constant y\ 


F(C).F(r‘ +l ) 

F{rl).F{r!) 


Convergence properties of CG are superior to SD method. 1 ' 19 
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(iii) The Powell Method: Another first-order minimization algorithm is the Powell 
method and belongs to the CG family of optimization method. It applies a more effi- 
cient CG method to determine the descent direction and it is more tolerant to inexact 
line searches. 2 ' 19 

(iv) The Newton-Raphson Method : This method does not use only first derivatives, 
but also the second derivatives in the minimization procedures. Second derivatives 
give information about the curvature of the function. 2 ' 19 

3.1.1 Geometry Optimizations of Investigated Systems 

The structures of the studied molecules were subjected to geometry optimization us- 
ing a combination of the standard Tripos MM force field of the Sybyl molecular mod- 
eling package 27 (Powell energy minimization algorithm, Gasteiger-Huckel charges 
and 0.001 kcal/mol A energy gradient convergence criterion). For the conformational 
analysis of template compounds used in 3D QSAR, ah initio B3LYP/6-31G* level 
QM calculations were also performed. 26 ' 28 " 31 

3.2 MC Simulations 

MC analysis of ligands has been carried out with the CHARMm force field of 
QUANTA package 32 (Powell energy minimization algorithm, Gasteiger-Huckel 
charges and 0.001 kcal/mol A energy gradient convergence criterion) as well as the 
semi-empirical methods of AM 1 and PM3 (SCF convergence criterion has been set to 
10" 6 as energy gradient convergence limit). 26 ’ 29 ’ 30 

3.3 3D QSAR/CoMFA and CoMSIA Settings 

The steric and electrostatic field energies were calculated using the Lennard-Jones 
and the Coulomb potentials, respectively with a 1/r distance-dependent dielectric con- 
stant in all intersections of a regularly spaced (2 A) grid. An sp 3 carbon atom with a 
radius of 1.53 A and a charge of +1.0 was used as a probe to calculate the steric and 
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electrostatic energies between the probe and the molecules using the Tripos force 
field. The truncation for both the steric and electrostatic energies was set to 30 
kcal/mol. 26 " 28,30 

3D QSAR/CoMFA and CoMSIA PLS Analysis and Validations'. The initial PLS analy- 
sis was performed using the “leave-one-out” cross-validation method for all 3D 
QSAR analyses. A minimum column filtering value of 2.00 kcal/mol was set to im- 
prove the signal to noise ratio by omitting those grid points whose energy variation 
was below this threshold. In both CoMFA and CoMSIA analyses, the descriptors 
were treated as independent variables, whereas the pK, values were treated as depend- 
ent variables in the PLS regression analyses in order to derive the 3D QSAR models. 
The final model (non-cross-validated conventional analysis) was developed from the 
model with the highest r c 2 v , and the optimum number of components was set equal to 

that yielding the highest r c 2 v . The non-cross-validated models were assessed by the 

conventional correlation coefficient r 2 , standard error of prediction, and F values. For 
the creation of the CoMFA field, ‘ CoMFA standard’ scaling was selected, while in 
the case of CoMSIA ‘ none ’ option was selected in the Sybyl. 26,28,29,31 

3.4 Molecular Docking 

Flexible docking studies have been applied using FlexX program of Sybyl molecular 
modeling package. 27 FlexX is a flexible docking method that uses an incremental con- 
struction algorithm to place ligands into a binding site of a receptor. 23 The base frag- 
ment (the ligand core) is automatically selected and placed into the active site of the 
receptor using a pose clustering algorithm. 2 Then, the remainder of the ligand is built 
up incrementally from other fragments. Finally, placement of the ligand is scored on 
the basis of protein-ligand interactions and the binding energy is estimated and 
placements are ranked. 

The aim of base placement is to allow favorable simultaneous interactions between 
ligand and protein by pose clustering method. 2 In this algorithm, the base fragment 
and receptor site are represented as a finite set of interaction points. A unique trans- 
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formation of base fragment into binding site is defined by superposing three interac- 
tion sites from the base fragment into three interaction sites on the receptor. Thus, the 
first step of the algorithm is to find all such compatible triangles . 2 All placements are 
clustered according to RMSD between sets of two placements, using the complete- 
linkage hierarchical cluster algorithm . 2 Basically, the two nearest clusters are merged 
into one, as long as the distance between them is lower than the predefined threshold. 
Conformational flexibility is taken into consideration by systematically generating 
low energy conformations of the ligand. Bond lengths and bond angles are held fixed 
at their input values and a set of up to twelve preferred dihedral angles is assigned ac- 
cording to records from a molecular fragment database. A grid box technique is used 
in the interaction energy calculations. Basically, the system is put inside a cubic grid, 
and to check a solution for steric overlap between receptor and ligand, only those re- 
ceptor atoms in cubes within the van der Waals radius of the ligand atom and the larg- 
est receptor atom radius must be checked. For each ligand-receptor interaction, the 
interaction center on the ligand (/,), the interaction center on the receptor (r,) and en- 
ergy contribution of the interaction (w;), provides optimal geometry. For the superim- 
position routing, /, are fitted onto r„ thus 'Zw i (l i - r ) 2 is minimized. In order to guide 

the growth of the ligand, a numerical measure of “goodness of fit” must be deter- 
mined in order to measure the strength of protein- ligand interactions . 2 

The physicochemical model behind FlexX can be divided into three parts: the analysis 
of the conformational space of the ligand, the model of protein-ligand interactions, 
and the scoring function. The scoring function of FlexX, developed by Bohm in order 
to rank the solutions, is an estimation of the free binding energy AG of the protein- 
ligand complex 23 : 

AG bind= AGo + AG H-bonds ^ / (A r, A a) + AG, 0 „, C ^ /(Ar, Aa) + AG arom i /(Ar, Aa) + 

H -bonds ionic arom 

AGupo^^l A Upo I + AG ro f N r ot 

lipo 

/(Ar, Aa) = /l(Ar)/2(Aa) 
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1 A r < 0.2 A 

/l(Ar) = < 1 — (Ar - 0.2) / 0.4 Ar < 0.6 A 

0 A r > 0.6 A 

1 Aa < 30° 

/2(Aa) = j 1 - (Aa - 30) / 50 Aa<80° 

0 Aa > 80° 

where; /(A r, A a) is a scaling function that penalizes deviations from ideal geome- 
try. AGo is a contribution to the binding energy that does not directly depend on any 
specific interactions with the protein. It may be considered as a reduction of binding 
energy due to overall loss of translational and rotational entropy of ligand. AG H-bonds 
represents the contribution from an ideal hydrogen bond. AG, omc describes the contri- 
bution from an unperturbed ionic interaction. AG a rom accounts for the interactions of 
aromatic groups and AG/, po represents the contribution from lipophilic interactions. 
AG rot describes the loss of binding energy due to freezing of internal degrees of free- 
dom in the ligand. N rot is the number of rotatable bonds that are immobilized in the 
complex. It is assumed that such interactions are proportional to An po , the lipophilic 
contact surface between the protein and the ligand. 

(i) Molecular docking studies for CB ligands at the active site ofCBl and CB2 recep- 
tors 30,3 3D models of the CB1 and CB2 receptors based on template rhodopsin from 
Tuccinardi et al. 33 has been used for the initial docking experiments, however, the 
critical binding site residues were determined from several molecular modeling stud- 
ies of CB receptors (e.g., Tuccinardi et al 33 , Salo et al. 34 and Shim et al. 8 ). The active 
site in the docking runs included all atoms within a radius of 6.5 A around the critical 
amino acids for CB1 receptor: Phel74, Leul90, Lysl92, Leul93, Glyl95, Vall96, 
Thrl97, Phe200, Thr201, Pro251, Trp356, Leu359, Ser383, Cys386, Leu387 and for 
CB2 receptor: Leul08, Seri 12, Prol68, Leul69, Trpl94, Trp258. (The complete list 
of amino acid residues form the binding site is detailed in the Appendix). In addition 
docking simulations were repeated with 3D models of the CB1 and CB2 receptors 
produced using p2-adrenergic template receptor. 

(ii) Molecular docking studies for fuller ene analogues at the binding cavity of HIV-1 

28 29 

PRr ’ : Since the X-ray structure of the HIV-1 PR complexed with fullerene based 
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inhibitors has not yet been reported, the initial structure was taken from the HIV-1 PR 
complexed with haloperidol derivative at 2.2 A resolution (pdb code; 1AID). 35 The 
water molecules and the inhibitor were then removed and all hydrogens were added to 
the system. Ionization states for ionizable amino acid residues were assigned accord- 
ing to the standard pK a values of amino acids. The geometry of the enzyme has been 
optimized by using the Tripos MM force field. The maximum number of conformers 
for each molecule was set to 30 and the top 10 lowest energy conformers were used in 
docking simulations and only the best docked complex was considered for further 
analysis. The default FlexX scoring function was used for the simulations. FlexX uses 
formal charges, which were turned on during the docking. The active site in the dock- 
ing runs, included all atoms within a radius of 6.5 A around the critical amino acids: 
Asp25, Asp25', Ile50, and Ile50'. (Full list of binding site residues have been detailed 
in the Appendix). 

3.5 MD Simulations 

The MD simulations were performed with Groningen Machine for Chemical Simula- 
tions (GROMACS) version 3.3.1 software package 36 using the gmx force field. 36 
Simulations were run with periodic boundary conditions. In a truly macroscopic sys- 
tem, surface effects may not be so crucial in the properties of the system, because as a 
spherical system increases in size, its volume grows as the cube of the radius, while 
its surface grows as the square. However, in a typical simulation, computational re- 
sources inevitably constrain the size of the system to be so small that surface effects 
may play an important role in the properties of the system under study. Thus, model- 
ing of a cluster may not tell one much about the behaviour of a macroscopic system. 
In order to eliminate surface effects from the computation, periodic boundary condi- 
tions can be used. 2 Periodic boundary condition are represented in Figure 3.1. Under 
periodic boundary conditions, the system which is modeled is assumed to be a unit 
cell in some ideal crystal (e.g., cube, hexagonal prism, orthorhombic). Periodic boun- 
dary conditions combined with the minimum image convention: only one, the nearest, 
image of each particle is considered for short-range non-bonded interaction terms. For 
the long-range electrostatic interactions this is not always accurate enough, and has to 
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and has to be incorporated with lattice sum methods (e.g., Ewald sum, particle mesh 
Ewald). In these methods, a particle interacts with all other particles in the simulation 
box and with all of their images in an infinite array of periodic cells. 



Figure 3.1 Periodic boundary conditions. 

Because of heating due to external or frictional forces, it is necessary to control the 
temperature of the system. In MD simulations, Berendsen thermostat were used . 37 
This algorithm mimics weak coupling with first-order kinetics to an external heat bath 
with given reference temperature To. Deviation of the system temperature from To is 
slowly corrected by 38 : 

dT T 0 -T 
At T 

where, r is the temperature coupling which is given by: 

C T 

r = 2 WIzl 

N df k 

where, C v is the total heat capacity of the system, k is Boltzmann constant, and N,if is 
the number of total degrees of freedom, x T is the user defined temperature coupling 
time constant and close to r but not exactly equal because the kinetic energy change 
caused by scaling the velocities is partly redistributed between kinetic and potential 
energy, thus change in temperature is less than the scaling energy. Similarly with the 
temperature coupling, the system can also be coupled to a pressure bath. For the MD 
simulations of the systems, Berendsen barostat was applied . 37 The Berendsen algo- 
rithm rescales the coordinates and box vectors with a scaling matrix /j in every step, 
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which has the effect of a first-order kinetic relaxation of the pressure with given refer- 
ence pressure P o : 

d P P 0 -P 

d t t p 

The scaling matrix p is given by 38 : 

Pa = 8 a ~ Pij { P 0ij ~ Pj ( 0 } 

iTp 

where, [i is the isothermal compressibility of the system. 

All bonds were constrained using the linear constraint solver (LINCS) algorithm 39 . 
LINCS is an algorithm that resets bonds to their correct lengths after an unconstrained 
update. Visualization of the dynamics trajectories was performed with the visual mo- 
lecular dynamics (VMD) software package 40 . 

3.5.1 Details of the Performed Simulations 

3.5.1.1 MD Simulations of CB 

(i) In DPPC lipid bilayers (without receptor ) MD simulations: The coordinates of 
AMG3 conformers were used as input at the PRODRG program 41 in order to obtain 
topologies which will be used in the MD simulations. DPPC lipid bilayer for the MD 
simulations was obtained from Dr. M. Karttunen (128 DPPC lipids and 3655 water 
molecules after 100 ns). 42 ' 43 Simulations were run in the NPT ensemble at 300 K and 
1 bar with periodic boundary conditions. Berendsen barostat and thermostat algo- 
rithms were used. 37 Electrostatic interactions were calculated using the particle mesh 
Ewald method. 44 Cut-off distances for the calculation of Coulomb and van der Waals 
interactions were 1.0 and 1.4 nm, respectively. Prior to the dynamics simulation, en- 
ergy minimization calculations were applied to the full system without constraints us- 
ing the SD method for 5000 steps with the initial step size of 0.01 A (the minimiza- 
tion tolerance was set to 100 kJ/(mol.nm)). The system was then equilibrated via 250 
ps simulations with a time step of 2 fs, subsequently a 2.5 ns simulations were per- 
formed at 300 K and 1 bar with a time step of 2 fs. All bonds were constrained using 
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the LINCS algorithm. 39 Origin 6.0 program was used for dihedral angles versus time 
plots and statistical calculations. 45 ( ii ) Membrane associated receptor MD simula- 
tions: The lipid used in solution calculations was employed here, however the lipid 
extended by 4x4x1 in xyz directions, respectively in order to have enough area of 
lipid for the protein merging. Same running parameters with DPPC lipid bilayers 
( without receptor ) MD simulations were used. (Hi) MD simulations at the homology 
modeling step of the receptors: Same parameters with above defined MD have been 
used. The convergence criteria have been tested by the potential energy versus time 
plot and showed sufficient convergence. Each receptor backbone structure from a tra- 
jectory is compared with reference (initial) structure in order to obtain the receptor 
backbone RMSD versus time plots (Figures A1 and A2, Appendix). 

3.5.1.2 MD Simulations of HIV-1 PR and Fullerene Analogues 

Canonical NVT ensemble at 300 K was used with periodic boundary conditions, and 
the temperature was kept constant by the Berendsen thermostat. 37 Electrostatic inter- 
actions were calculated using the particle mesh Ewald method. 44 Cut-off distances for 
the calculation of Coulomb and van der Waals interactions were 1.0 and 1.4 nm, re- 
spectively. Prior to the dynamics simulation, energy of the full system has been opti- 
mized without constraints using the SD integrator for 5000 steps. The system was 
then equilibrated via a 100 ps MD simulations at 300 K. Finally, a 2 ns simulation 
was performed with a time step of 2 fs. The convergence criteria have been tested by 
the potential energy versus time plot and showed sufficient convergence. The receptor 
backbone RMSD versus time plot has also been presented (Figure A3, Appendix). 

3.6 De Novo Drug Design 

FeapFrog algorithm under Sybyl was used in order to automatically generate a series 
of ligands for the binding pocket of a receptor. 27 Feapfrog is a second generation de 
novo drug discovery program for the design of potentially active compounds, using 
molecular evolution or electronic screening, by repeatedly making structural changes 
and then either keeping or discarding the obtained results, depending on the binding 
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energy results. There are two starting input options in order to generate site point 
probe atoms that will be used in the binding energy calculations, these are a pharma- 
cophore model or a receptor structure. The charge of the site point probe atom is posi- 
tive, negative or lipophilic and its value is compared with ±1.0: If the value is smaller 
than +1.0, it is lipophilic, if the value is bigger than +1.0, site points seek a negative 
atom and if the value is less than -1.0, site points seek a positive atom in the fragment. 
Binding energy calculations in LeapFrog were performed by steric, electrosteric and 
hydrogen bonding enthalpies of ligand cavity binding using the Tripos force field un- 
der Sybyl molecular modeling package (v. 6. 8). 27 As with many de novo drug design 
programs, central operation of LeapFrog is the processing loop. In each pass through 
the loop, a type of move is selected randomly. If the move succeeds, a new structure 
is evaluated. A new ligand which passes evaluation is added to the pool of ligands 
available for the next move. The fragments used in JOIN, FUSE and BRIDGE are 
stored as a molecular data base in Sybyl. A hydrogen atom is chosen within the se- 
lected ligand, randomly; and a local energy check is performed on its cavity environ- 
ment within a 3.0 A radius. If steric interactions are not favorable over more than half 
of the enviromnent volume, the hydrogen atom is sterically excluded. If the first cho- 
sen hydrogen is not accessible, another one is chosen, randomly until an accessible 
one is found. If no accessible hydrogens are found, the JOIN move fails. The FUSE 
move process is similar with JOIN; environment checks for steric accessibility are 
performed as JOIN move, however in a FUSE attempt, existence of a ring bond 
flanked by hydrogen in both ligand and fragment are required. Thus, FUSE move 
aims to fuse (usually rings) starting ligand and fragment from data base. The BRIDGE 
move attempts to bridge available fragments. 46 " 49 

LeapFrog studies for CBs. As an initial basic procedure of LeapFrog, site point probe 
atoms were generated using the receptor cavity as well as a pharmacophore model 
inferred from the PLS results options. Template compound AMG3 was selected as 
starting structure. Firstly, the OPTIMIZE module was used for the improvement in 
binding energy. Secondly, several moves such as JOIN, FUSE, BRIDGE and 
OPTIMIZE options were used after the initial run of 100 moves taken into account the 
synthetic difficulties. The derived ligands that had the best binding energy were used 
for repeating the cycle of 5000 moves. 
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LecipFrog studies for fullerenes. Same parameters used in CBs have been applied for 
the fullerenes. Monoadduct and bisadduct of [60]fullerene derivatives that have 
higher affinity with HIV-1 PR (molecules 1, 3, 23, and 42, Tables 6.1 and 6.2 in 
Chapter 6) were used as starting structures with restricting to only changes of pendant 
groups on the [60]fullerenes. 
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Chapter 4. Strategies in Computational Drug Design 
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In-silico drug design has supported pharmaceutical research for over three decades. 
The receptor-based and ligand-based approaches are two major drug design method- 
ologies. 4 ' 24 These strategies, in particular in combination with fragment-based lead 
discovery (identification of bioactive compounds by assembling small-molecule 
fragments) are considered to be complementary to high-throughput screening. A pre- 
requisite for ligand-based approach is the availability of at least one bioactive refer- 
ence structure. 4 ' 24 Therefore, in ligand-based design, attempts are done to derive other 
compounds having a similar shape and electrostatic potential with the known bioac- 
tive compound in order to improve the biological activity. When an X-ray structure or 
3D homology model of a receptor is known, then receptor-based approaches can be 
used to screen compound collections virtually. Usually, both ligand-based and recep- 
tor-based techniques may be applied simultaneously. Pseudoreceptor modeling fills 
the gap between these two approaches. 50 3D QSAR techniques that map molecular 
features onto a field surrounding the spatial alignment of reference conformers are 
used in pseudoreceptor generation. The pseudoreceptor models attempt to capture the 
shape of the ligand binding pocket. 50 The 3D information of receptor (target) is not 
always available at the beginning of a drug design project. For example, it is very dif- 
ficult to obtain the X-ray structure of GPCRs or other membrane -bound proteins. Cur- 
rently, protein data bank (PDB) contains only about 30 entrie (out of 50,000 total en- 
tries) related to GPCR (rhodopsin and p2-adrenergic receptor structures). Thus these 
projects are limited to ligand-based design methods or structural information may be 
obtained from homology modeling studies based on available X-ray structure from 
same family. 

Computational design of novel lead drugs is getting more and more popular and tends 
to substitute the classical approach. 51 " 53 There are many reasons that contributed to the 
preference of in silico design compared to the classical approach 53 : (i) The advance- 
ment of the computer science which leads to the construction of powerful and friendly 
used computers; (ii) the development of statistical packages that can utilize databases 
containing theoretical or experimental data which can be subjected to QSAR; (iii) the 
development of new techniques in the experimental procedures for characterizing pro- 
teins and biological targets (i.e. X-ray crystallography and NMR spectroscopy); (iv) 
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the increase in the knowledge of the molecular basis of drug action. Thanks to the ad- 
vancements in enzymology, molecular biology, pharmacokinetics and pharmacody- 
namics; to mention only few representative fields that experienced huge developing 
steps; (v) the application of conformational search methodologies (e.g., MD and MC 
simulations); (vi) Adsorption, Distribution, Metabolism, Excretion and Toxicity 
(ADMET) simulations have also contributed significantly in the drug development. 

The most important characteristic of the rational drug design is to utilize in a positive 
way all known information of the system under study for developing a strategy for 
potential leads in drug discovery. This knowledge hopefully will lead to reduced hu- 
man power cost, time saving and laboratory expenses. For the drug development new 
experimental and theoretical background is necessary. Usually, the first unsuccessful 
attempts serve as a feed back for development of new drugs. 

Each specific system under study requires its own strategy in the computational drug 
design. Some strategies used in the current research activities for tackling problems 
with different information in the system will be given. In first system, there is no in- 
formation available for the receptor binding site (system-A). Thus, no crystallographic 
data for the ligand bound to the receptor binding site is available in this system. In the 
second case, the X-ray structure of the receptor is not available, but homology models 
of the receptors have been performed using the available X-ray of a prototype recep- 
tor (system-B). Another case is when the X-ray structure of the receptor or ligand- 
bound receptor, associated with experimental ligand binding measurement tests, are 
available (system-C and system-D, respectively). 

In order to develop more active novel compounds for system-A, strategy which is 
shown in Figure 4.1 can be used. Steroidal mustard esters for the antileukemic activity 
can be given as an example for the system-A. 

CBs can be given as an example of system-B. Because of the absence of X-ray struc- 
ture of CB receptors, led to their modeling based on the rhodopsin as well as p2- 
adrenergic receptors that constitute also members of GPCRs. The strategy followed 
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for designing novel leads was based on application of: (i) biophysical studies (ii) theo- 
retical calculations combining with 2D NMR spectroscopy; (iii) in silico docking si- 
mulations; (iv) MD simulations at different environments; (v) 3D QSAR studies; 
(Figure 4.2). 


Anti HIV-1 PR fullerene analogues design can be given as example for the system-C. 
In this case, the X-ray structure of the non-fullerene ligand bound HIV-1 PR receptor 
is known and biological ligand binding data for some analogues possessing different 
biological activity are available. The strategy applied in this case is shown in Figure 
4.3. 


de novo Drug Design 
(i.e.. Leapfrog) 


CONFORMATIONAL 
ANALYSIS (NMR, 
molecular modeling, etc) 




3D QSAR 


V 7 


Synthesis of Novel 
Compounds 


■3 

In vivo biological 
activity tests 



Figure 4.1 Strategy used to develop more potent bioactive molecules for molecules 
acting on a system-A type. 
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Figure 4.2 Strategy used for designing new leads in the system-B. 
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Figure 4.3 Strategy used for designing new leads in the system-C. 


64 


Renin inhibitors can be given as example for the system-D. The target of these com- 
pounds is the inhibition of the action of renin, an aspartic protease that evolves in the 
elevation of blood pressure through the renin angiotensin system. In this case, the X- 
ray structure of the renin with a variety of renin inhibitors is known. For this kind of 
systems, strategy shown in Figure 4.4 can be applied. 


Determination of the 
enzyme binding site 
using X-ray data 


I 



Figure 4,4 Strategy used for designing new leads in the system-D. 


In the current thesis, examples of systems B and C have been studied extensively. 


Given the atomic resolution structure of a target molecule, it should be possible to 
find novel molecules that bind to it, modulating its activity. Indeed, this is one of the 
main aims behind all receptor-based ligand design. Molecular docking is a key tool in 
the target-based approach. The promise of the docking method is that the structure of 
the receptor will provide a template for the design of novel ligands. The need to ac- 
count for the dynamic behaviour of a target has long been recognized as a complicat- 
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ing factor in computational drug design field. Usually a single rigid receptor structure 
was chosen in order to reduce the computational cost. For example, if a large database 
of compounds is to be screened for the binding affinity against a target, then several 
conformers of each compound will be compared with each protein configuration. Al- 
though it is more accurate to use the above mentioned way, it is usually impractical, 
because of very high computational cost. In order to gain flexibility to receptor and 
ligand, one of the easiest ways is to combine MD simulations with docking calcula- 
tions. 


As it is mentioned, conformational analysis is a crucial step in computational drug 
design strategy. In the current thesis, the conformational analysis shown in Figure 4.5 
was followed. 


In Vacuum 


Free-ligand 

Geometry Optimization 
Calculations 

Monte-Carlo Simulations 
Rotational Barrier Calculations 


Conformational Analysis Strategy 


At binding site 
of the receptor 

Docking 


At membrane bilayer 


MD Simulations 


Free-ligand * T binding site 
of the receptor 


Trajectory Analysis 


Figure 4.5 Performed conformational analysis strategy. 
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Chapter 5. Conformational Analysis, 3D QSAR, Ho- 
mology Modeling, Molecular Docking and MD Simu- 
lations of CBs 
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5.1 Introduction 


Cannabis sativa L. is one of the oldest known medicinal plants and has been exten- 
sively used with respect to its psychotropic and pharmacological effects. A 9 - 
tetrahydrocannabinol (A 9 -THC), (Figure 5.1a) is the primary psychoactive constituent 
of cannabis and it was identified by Gaoni and Mechoulam in 1964. 54 

The pharmacological activity of CB ligands is mediated by two CB receptors: CB1 55 " 
58 and CB2. 59 Both CB1 and CB2 receptors belong to the Class A, membrane-bound 
rhodopsin-like family of GPCR, possessing seven characteristic TM domains. 4 " 60 ’ 61 
The CB1 receptor is abundant especially in the central nervous system (CNS) 62 ' 63 , pe- 
ripheral tissues 64 and is assumed to be involved in the regulation of cognition, mem- 
ory, motor activity, and the inhibition of transmitter release through its coupling to 
Ca 2+ and K + channels. The CB2 receptor, on the other hand, is exclusively present in 
the tonsils and cells of immune system 65 ' 66 such as B lymphocytes and macrophages. 
It is also found in the marginal zone of the spleen. The CB2 receptor is assumed to 
participate in the regulation of immune responses and inflammatory reactions and 
neuropatic pain. Pharmacological studies have shown that CBs possess many poten- 
tial therapeutic applications including against cancer, AIDS, stroke, pain, obesity, 
cachexia and neuronal disorders such as multiple sclerosis, Huntington’s chorea and 
Parkinson’s disease, as well as reduction of blood ocular pressure in glaucomic pa- 
tients. 5 ’ 6 ’ 67 " 74 

The recent studies showed that CB receptors work to block pain with a mechanism 
similar to the one which opiate receptors use when activated by the powerful 
painkilling drug morphine. 74 The CBs might provide an alternative to morphine, 
which can have serious side effects such as dependency, nausea and vomiting. 74 

CBs can be classified mainly into three categories: Natural (herbal) or classical CBs, 
endogenous CBs, and synthetic CBs. Natural CBs occur in the cannabis plant. THC, 
CBD (Figure 5.1b) and cannabinol (Figure 5.1c) are examples of natural CBs. En- 
dogenous CBs are produced in the bodies of human and animals. An endogenous CB 
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ligand isolated from porcine brain by Mechoulam and Devane in 1992 75 and is identi- 
fied as arachidonylethanolamide (anandamide), (Figure 5. Id). It binds to the CB1 and 
CB2 receptors with modest (K; = 61 nM), and with low (K, = 1930 nM) affinities, re- 
spectively. 75 It behaves as a partial agonist in the biochemical and pharmacological 
tests used to characterize CB activity. 67 ' 76 2-Arachidonyl glycerol (2-AG), (Figure 
5. 1 e) is the second identified endogenous CB, and was isolated from brain and intes- 
tinal tissues by Sugiura and Mechoulam groups. ’ It was found to bind weakly to 
both of CB1 (K; = 472 nM) and CB2 (K, = 1400 nM) receptors. Prior to the discovery 
of CB receptors, a number of independent research laboratories and pharmaceutical 
industries developed a large number of synthetic CB ligands as pharmacological and 
biochemical probes for studying CB biology and also prototypes for developing new 
medications. 4 ' 79 ' 80 HU-210, CP55,940 and Nabilone are examples of such synthetic 
CB analogues (Figures 5. lf-h). 81 The discovery of endogenous ligands prompted fur- 
ther studies aimed at the elucidation of the chemical and pharmacological behavior of 
the CB 1 and CB2 receptors and cannabinomimetic ligands. These studies pointed out 
that, in addition to classical CBs, other structurally different molecules may interact 
with the same receptors, inducing analogues responses. 68 ' 82 Classical CBs possess two 
pharmacophores within the CB prototype that are important for cannabimimetic activ- 
ity: a phenolic hydroxyl and a lipophilic side chain. The early SAR studies have been 
reviewed comprehensively by Thakur 67 , by Khanolkar 6 , by Razdan 83 , and by Makri- 
yannis and Rapaka 84 . Earlier literature reports 5 ’ 6 ’ 67 ’ 84 ’ 85 showed that the lipophilic alkyl 
side chain plays a crucial role in determining cannabimimetic activity and selectivity 
towards CB receptors, as well as pharmacological potency. The alkyl side chain fits 
into a hydrophobic pocket such that the chain is oriented nearly perpendicular to the 
aromatic ring A. 5 ’ 86 " 89 Analogues with alkyl side chains of less than five carbons have 
limited affinity for the CB1 receptor. 6 67 Extension of the five carbon chain by adding 
one or two carbons improves binding, while further extension is detrimental to bind- 
ing due to steric hindrance. 5 ’ 6 ’ 67 Structural variations within this pharmacophore can 
result in analogues varying by up to three orders of magnitude in binding affinity for 
the CB receptor and in pharmacological potency. The structural modifications of the 
side chain produce high affinity ligands with either antagonist, partial agonist, or full 
agonist effects. 5 
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Figure 5.1 Chemical structures of (a) A 9 -THC, (b) CBD, (c) Cannabinol, (d) Anan- 
damide, (e) 2- AG, (f) HU-210, (g) CP55,940, (h) Nabilone, (i) A 8 -THC. 


In order to improve the medicinal properties and eliminate or reduce untoward effects, 
medicinal chemists are designing, synthesizing and testing additional CB1 and CB2 
ligands. One of main effort of our laboratory is to explore the pharmacophoric re- 
quirements of the alkyl side chain within the classical A s -tetrahydrocannabinol (A 8 - 
THC), (Figure 5. li) and CBD (Figure 5.1.b) templates. A 8 -THC has a very similar 
pharmacologic profile as A 9 -THC, however it is chemically more stable. Several can- 
nabinergic ligands possessing high affinities for both of CB receptors have been de- 
veloped recently (Table 5.1). One of the most successful compounds that resulted 
from this work was the Cl’-dithiolane analog AMG3; (12 in Table 5.1) exerting bind- 
ing affinity values of 0.32 and 0.52 nM for the CB1 and CB2 receptors, respectively. 85 

Hitherto, no direct observation of a CB ligand bound to a CB receptor using X-ray 
crystallography has been reported. 90 Therefore, active sites of these receptors have 
been postulated from many approaches, such as receptor binding analyses of a variety 
of CB derivatives using wild type and mutated receptor systems, molecular modeling 
analysis and 3D QSAR studies. 90 " 96 3D QSAR/CoMFA and CoMSIA techniques 9 ' 10 
have been successfully used previously in 3D QSAR studies of CBs and other 
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ligands. 72 ’ 90 ’ 94 ’ 97-103 The present study uses 3D QSAR/CoMFA and CoMSIA analyses 
on novel CB analogues (Table 5.1) 85 4 044 07 with a wide variation of biological activity 
(4000, and 1200-fold variances in bioactivity for the CB1 and CB2 receptors, respec- 
tively). Thus studied compounds are characterized by subtle structural variations and 
a wide range of biological activities which constitute an ideal base for 3D QS AR stud- 
ies. 

The selection of the lowest energy conformation of the bioactive conformation of the 
template molecule and the superimposition of all molecules on template compound 
are the two most critical steps in the 3D QSAR studies, especially for CoMFA and 
CoMSIA methodologies. These steps not only affect the output of the analysis, but 
they also contribute to the design of novel molecules. Among the synthesized ana- 
logues shown in Table 5.1, AMG3 was selected as a template, because: (i) it has the 
highest binding affinity at the CB1 receptor and forth highest binding affinity at the 
CB2 receptor, in the dataset; (ii) preliminary results of the low energy conformers of 
AMG3 have been reported using a combination of 2D NMR spectroscopy and mo- 
lecular modeling techniques. 89 

The knowledge of the receptor structure is not a prerequisite for 3D QSAR analysis, 
however, the availability of its crystal structure or 3D model facilitates the structure 
alignment, and can provide statistically more reliable models. 108 ' 109 3D homology 
models of the CB1 and CB2 receptors were constructed by Shim et a I.,'' Salo et a/. 34 
and Tuccinardi et al. r ' with a molecular modeling procedure using the X-ray structure 
of bovine rhodopsin 1 10 as the initial template and taken into account the available site- 
directed mutagenesis data. These groups studied different CB classes, however, they 
found them to interact at an active site with similar homologies. Therefore, in addition 
to the conformational analysis studies of template compound in solution, conforma- 
tional analysis studies have been also carried out at the binding sites of the receptors. 
Although, the more recent 3D receptor models were used for the molecular docking 
studies, the active site residues are determined considering both three models men- 
tioned above. 
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In order to include solvent and temperature effects to the conformational analysis of 
the template compound more sophisticated calculations have been performed both in 
solution and at the membrane associated receptors using MD simulations. The ob- 
tained conformations of template compound have been compared and for each con- 
former 3D QSAR models were created and statistical analysis calculations were per- 
formed. The effect of the increase of biological resemblance of a system to the de- 
rived statistical results from constructed QSAR models was investigated. 

Reported low energy conformations of A-THC analogues using NMR spectroscopy 
and molecular modeling studies in solution differed. 89 ' 90 ' 102 '" 1 The derived conforma- 
tions differed in the alkyl side chain or in the ABC tricyclic segments (e.g., conform- 
ed A-D in Figure 5.2). For example, in both conformers A and B, proposed for clas- 
sical A S -THC analogues, and shown in the Figure 5.2, the alkyl chain adopts an or- 
thogonal orientation relative to the horizontal plane of ring A, however B and C rings 
have different geometries (e.g., ring B has half chair-like and boat-like forms, in con- 
formers A and B, respectively). In conformer C, the alkyl chain has been extended 
away from the ABC tricyclic segment while in conformer D the alkyl chain has been 
wrapped towards the tricyclic part (Figure 5.2). 

Conformational studies of AMG3 using a combination of ID and 2D NMR spectro- 
scopies as well as molecular modeling techniques, showed that, the alkyl side chain 
adopts a perpendicular orientation relative to the horizontal plane of ring A of 
AMG3. 89 The conformation of the flexible l',l' dimethylheptyl side chain was also 
analyzed using a combination of theoretical studies and NMR experiments for classi- 
cal CB (-)-9-nor-9p-hydroxy(dimethylheptyl)-hexahydrocannabinol and nonclassical 
CBs CP47,497, CP55, 244 and CP55, 940 by Xie et al . 86 ’ 112 ’ 113 . Results showed that 
the alkyl side chain is almost perpendicular to the horizontal plane of the ring A. 
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Figure 5.2 (top) Molecular structure of AMG3 and; (bottom) its derived 3D low en- 
ergy conformers A-I. Dihedral angles of the alkyl side chain are assigned on the top 
structure (t,, C2-C3-C1’-C2'; t 2 , C3-C1'-C2'-C3’; t 3 , Cl'-C2'-C3’-C4'; t 4 , C2’-C3’-C4’- 
C5'; t 5 , C3'-C4'-C5'-C6'; t 6 , C4’-C5’-C6'-C7’). 


CBs are predicted to exert their biological action in the TM3-TM7 helices of 
GPCRs. 5 114 Reported experimental results suggest that biological activity of CBs can 
be related with two sequential criteria: (i) ‘proper’ topology and orientation of the 
drug in the membrane bilayer; (ii) diffusion and ‘appropriate’ fit of the drug in the 
receptor. 115 CBs are lipophilic molecules and are considered to first interact with the 
lipid microenvironment that surrounds the membrane-associated protein and then dif- 
fuse laterally at the active site of the receptor. 116 Therefore, it is important to under- 
stand the conformational properties of a drag molecule into the lipid microenviron- 
ment. Membranes do not lend themselves in a detailed analysis of their structural and 
dynamical properties by means of a single physicochemical method because of their 
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complexity and instability. 115 Thus, it is preferable to combine several experimental 
and/or computational methods seeking to obtain molecular information on the interac- 
tions of drugs with membranes. 11:1 Biophysical studies using different techniques 
(solid-state NMR spectroscopy, X-ray diffraction, Raman spectroscopy, IR spectros- 
copy, differential scanning calorimetry (DSC), etc.) in combination with molecular 
modeling studies assist in the determination of drug-membrane interactions and the 
role of membrane in the putative bioactive conformation of the drug molecule. 

The effect of the cannabinomimetic drug AMG3 on the thermotropic and structural 
properties of dipalmitoyl-.vn-glycero-S-phosphorylcholine (DPPC) liposomes have 
been studied by X-ray diffraction and DSC methodologies by Mavromoustakos et 
al. ni AMG3 was found to efficiently fluidize domains of the lipids in the Lp' gel 
phase and to perturb the regular multibilayer lattice. In the liquid crystalline L a phase, 
AMG3 was also found to cause irregularities in packing, suggesting that the drug in- 
duces local curvature. Biophysical studies by Makriyannis el a/. 115 have also provided 
detailed information for the topography, the stereochemistry, and the dynamic proper- 
ties of the CB ligand-membrane interactions by applying neutron diffraction, solid 
state NMR, DSC and small angle X-ray spectroscopy of A 8 -THC. 11 8-120 In these stud- 
ies, THC assumes an ‘awkward’ orientation in the bilayer with the long axis of its tri- 
cyclic system being perpendicular to the bilayer chains, while its aliphatic side chain 
orients parallel to the chains of the membrane phospholipids. 

Generated 3D QSAR models were used in de novo drug design program and allowed 
us to propose novel ligands with higher predicted binding affinity values. In order to 
determine the linear correlation coefficients between actual (measured) versus calcu- 
lated binding affinities, PLS statistical analyses of the data were used. CoMFA and 
CoMSIA contour plots were used to explain different structural requirements for CB 
binding to the CB1 and CB2 receptors. Derived contour plots can be used as pilot 
models for testing the designed novel analogues before their synthesis. 
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5.2 Results and Discussion 


5.2.1 Conformational Analysis of AMG3 in Vacuum and in Lipid Bilayer Envi- 
ronments (without Receptor) 

5.2.1.1 Selection of Low Energy Conformers of AMG3 Using MC Studies 

Low energy conformers of AMG3 are derived using MC conformational search 
analysis. The application of MC analysis, which allows full angular window specifi- 
cation and random change of dihedral angles of rotatable bonds, derived 1000 con- 
formers of AMG3. These conformers were clustered into eight different groups based 
on the dihedral angle criterion. The lowest energy conformers from each cluster have 
been selected for further analysis. The number of conformers for each of the eight 
clusters has been shown in the Table 5.2. Cluster analysis showed that conformer A (it 
has a perpendicular orientation of alkyl side chain relative to long axis of the ABC 
tricyclic segment) and wrapped conformer D (wrapped conformations are defined as 
those conformations adopting gauche 13 and T4 dihedral angles at the alkyl chain) have 
highest and smallest group member populations, respectively. Among eight lowest 
energy conformers, three conformers (A, C, and D) of AMG3 in Figure 5.2 are identi- 
cal with those reported conformations 89 ' 90 ' 102 '" 1 using experimental results and/or mo- 
lecular modeling techniques, while other five conformations (conformers E-I of 
AMG3 in Figure 5.2) differed. Conformer B was not obtained by MC cluster analysis. 

5.2.1.2 Geometry Optimization Calculations 

MM geometry optimization has been applied using Sybyl molecular modeling pack- 
age. 27 According to MM calculations, conformer B has significantly high total energy 
(~8.8 kcal.moF 1 ) due to its high torsional energy, while other conformers have a simi- 
lar energy plateau within the range of ~2.3 kcal.mol" 1 (Table 5.2). The high energy 
conformer B of AMG3 has been transformed to conformer A at the ah initio 
B3LYP/6-31G* ’ level optimizations. QM calculations show that all low energy 

conformers are almost isoenergetic (maximum total energy differences between the 
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conformers is ~2.5 kcal.mof 1 , Table 5.2). In order to examine the solvent effect over 
geometrical properties of conformers, the vacuum medium has been modified to am- 
phiphilic environment. The dielectric constant s was set to 45, in order to simulate an 
amphiphilic environment, which mimics physicological conditions and therefore it is 


Conformer 

Number of 
conformers 
in each ob- 
tained cluster 
by MC anal- 
ysis 

MM Relative Energy 
(kcal.mol 1 ) 

QM Geometry 
Optimization 
(B3LYP/6-31G*), 
Relative Energy 
(kcal.mof 1 ) 

Torsional 

Energy 

VDW 

Energy 

Electrostat. 

Energy 

Other Contribu- 
tions 3 

to MM Energy 

Total 

Energy 

A 

188 

5.520 

-3.821 

-5.820 

6.621 

2.499 

0.10 

B 


9.147 

-2.674 

-5.936 

8.211 

8.748 

0.10 

C 

147 

5.550 

-3.055 

-5.817 

6.688 

3.366 

1.26 

D 

42 

6.412 

-5.857 

-5.696 

7.647 

2.508 

2.56 

E 

102 

6.493 

-6.877 

-5.676 

8.157 

2.097 

0.00 

F 

156 

5.528 

-3.877 

-5.782 

6.558 

2.427 

0.00 

G 

92 

6.180 

-2.894 

-5.693 

6.773 

4.366 

1.57 

H 

134 

5.699 

-3.013 

-5.788 

7.120 

4.018 

2.03 

I 

139 

5.783 

-4.094 

-5.726 

7.507 

3.470 

1.48 

Average 6 

111 

6.257 

-4.018 

-5.770 

7.254 

3.723 

0.93 


“Other contributions include bond stretching energy, angle bending energy, out of plane bending energy. b Average results do 
not include conformer B for QM relative energies, because conforraer B is transformed to conformer A when QM geometry 
optimization is applied. 


Table 5.2 Number of conformers of AMG3 in each derived cluster by MC analysis 
and comparison of the total energies of conformers using MM and QM methods. 

appropriate for investigating biological structures. 12j It is observed that the effect of 
continuum model (e = 45) to the analyzed conformers compared to gas phase is very 
limited as it is depicted by the small RMSD value between the conformers in gas 
phase and in continuum model. Dihedral angles of the alkyl side chain segment of all 
conformers applying full geometry optimization with MM and QM methods in gas 
phase and in continuum model are presented in Table 5.3. 
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5.2.1.3 Rotational Energy Barrier Calculations 


In order to characterize the conformational flexibility properties of AMG3, rotational 
energy barriers were estimated using torsional grid scan analysis with semi empirical 
method PM3. ' Six rotatable torsional angles of AMG3, shown in Figure 5.2 and de- 
fined in its figure legend, were analyzed. The analysis is initiated with n and the en- 
ergetically lowest structure (optimal dihedral angle) is used for the next torsional an- 
gle analysis. Rotation around dihedral angles T 4 -T 6 showed similar energy profiles and 
rotational energy barriers are found to be ~4 kcaLmol" 1 . Their optimal dihedral angle 
was found to be -180° and the local minima were observed at -60° and -300°. Rota- 
tion around dihedral angles ii and 12 showed a rather more complex energy profiles 
due to the presence of the tricyclic ring segment. Rotation around dihedral angle 13 
showed the largest rotational energy barrier (~6 kcal.mof 1 ) and its optimal dihedral 
angle value was found to be -180° (Figure 5.3). 

5.2.1.4 MD Simulations of Conformers in Lipid Bilayer 

Computer simulations in general and MD simulations in particular, are of increasing 
importance in revealing details of molecular motions as well as structural and micro- 
scopic properties of the solution, which are difficult to measure experimentally . 125 
Heating increases the kinetic energy of the system which after equilibration at the 
given temperature overcomes any energy barriers close to the initial energy minimum. 
In order to examine the environmental effects over the structures, MD simulations 
were performed for all examined conformers with the Gromacs 3.3.1 software pack- 
age 36 in DPPC membrane bilayer environment. MD simulations were used in order to 
(i) further study the conformational space of AMG3 and (ii) explore the possibility of 
interconvertion between conformers in amphiphilic environments. 

MD in DPPC bilayer. Initial positions of the conformers in DPPC bilayer have been 
constructed according to experimental findings . 115 Alkyl chain of conformers have 
been inserted through the alkyl chains of the lipid (parallel orientation with lipid bi- 
layer chains ). 115 The tricyclic ABC segment of the AMG3 was localized close to the 
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head groups of DPPC bilayers and orienting their long axis perpendicular to the bi- 
layer plane (Figure 5.4). MD simulations analysis have shown that last four defined 
torsional angles (T 3 -T 6 ) at the alkyl side chain of AMG3 are all trans with some per- 
turbations around these angles for all of the analyzed conformers. Therefore, wrapped 
conformers (e.g., D and E) and conformers which have gauche dihedral angles within 
T 3 -T 6 (e.g., H and I) do not keep their initial values and turn to trans dihedral angles. 
(Torsional angle screening throughout the simulations for wrapped conformers D and 
E have been shown in Figure A4, Appendix). The dihedral angle xi shows more resis- 
tance to be transformed to another torsional angle through simulations. The dihedral 
angle 13 is the most flexible torsional angle in the alkyl chain and adopts gauche ± and 
trans dihedral angles, thus it determines the characteristics of the conformers. MD 
simulations of AMG3 in lipid bilayer environment produced three more low energy 
conformers in addition to previous ones. These conformers are called J, K and L and 
are shown in Figure 5.5 (their adopting dihedral angles of alkyl chain are presented in 
their corresponding figure legend). Trajectory analysis results showed that stable con- 
formers have gauche± for xi, gauche± and trans for X 2 and all trans conformations for 
X 3 -X 6 dihedral angles throughout the simulations. These stable conformers are A, C, F, 
J, K and L, and are considered for further investigations. 

These stable conformers form two favored plane angles which are are -90° and -140°. 
The first plane is defined by ring A (Figure 5.2), and the second plane is defined by 
the alkyl chain (Figure 5.6). Formed plane angles of favored conformations after MM 
and QM geometry optimizations and their relative energies are compared in the Table 
5.4. Relative energy differences are used for clarity and are defined on lowest total 
energy of conformers calculated by MM and QM methods. The lowest energy con- 
formers (J for QM and K for MM methods) are arbitrarily assigned with zero poten- 
tial energy and relative energies of other conformers are calculated. Both MM and 
QM calculations showed that, all favored conformations derived from MD simu- 
lateons in lipid bilayer have similar relative energies; however conformers which 
form perpendicular plane angle between tricyclic ring and flexible segments have 
slightly lower energy (~1 kcal/mol) than corresponding plane angle of -140°. 
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c 

o 

N 

F. 

Ti, (degree) 

t 2 , (degree) 

t 3 , (degree) 

T4 , (degree) 

t 5j (degree) 

t 6 , (degree) 

RMSD 

MM 

8=1 

MM 

8=45 

QM 

8=1 

QM 

8=45 

MM 

8=1 

MM 

8=45 

QM 

8=1 

QM 

8=45 

MM 

8=1 

MM 

8=45 

QM 

8=1 

QM 

8=45 

MM 

8=1 

MM 

8=45 

QM 

8=1 

QM 

8=45 

MM 

8=1 

MM 

8=45 

QM 

8=1 

QM 

8=45 

MM 

8=1 

MM 

8=45 

QM 

8=1 

QM 

8=45 

MM 

QM 

A 

73.8 

73.2 

63.3 

65.3 

57.8 

57.8 

61.1 

61.3 

178.9 

179.0 

181.1 

181.7 

179.7 

179.7 

180.5 

180.2 

180.0 

180.0 

180.2 

180.8 

180.0 

180.0 

180.1 

180.0 

0.25 

0.75 

B 

78.1 

78.1 

63.3 

65.3 

56.7 

56.8 

61.1 

61.3 

178.5 

178.5 

181.1 

181.7 

179.3 

179.3 

180.5 

180.2 

180.1 

180.1 

180.2 

180.8 

180.0 

180.0 

180.1 

180.0 

0.01 

0.75 

C 

82.0 

82.0 

81.5 

77.2 

181.0 

181.0 

184.0 

176.5 

179.8 

179.8 

185.3 

182.2 

180.0 

180.0 

180.6 

180.6 

180.0 

180.0 

180.5 

179.6 

180.0 

180.0 

179.9 

180.0 

0.00 

3.86 

D 

55.4 

54.8 

62.6 

63.1 

65.0 

65.2 

65.5 

65.3 

293.9 

294.3 

279.2 

278.9 

191.4 

191.3 

185.0 

185.0 

181.9 

181.8 

179.4 

179.4 

180.3 

180.2 

179.4 

179.5 

0.13 

0.25 

E 

55.7 

55.2 

61.4 

59.0 

68.9 

69.1 

66.7 

68.6 

300.8 

301.2 

281.5 

285.5 

191.0 

190.9 

186.8 

186.2 

176.5 

176.4 

176.5 

177.7 

63.9 

63.7 

65.9 

66.0 

0.12 

2.08 

F 

258.0 

257.6 

250.5 

244.1 

57.9 

58.0 

60.9 

60.8 

178.6 

178.6 

180.8 

180.7 

179.7 

179.7 

180.1 

180.2 

180.0 

180.0 

180.1 

180.0 

180.0 

180.0 

180.0 

180.0 

0.17 

2.61 

G 

187.4 

187.0 

187.2 

187.1 

176.9 

177.0 

175.6 

175.6 

176.6 

176.5 

177.8 

177.8 

179.5 

179.5 

179.6 

179.5 

179.9 

179.9 

179.5 

179.5 

180.0 

180.0 

180.1 

180.1 

0.17 

0.06 

H 

81.9 

82.2 

86.9 

83.0 

181.1 

181.1 

182.7 

184.1 

179.9 

179.9 

184.8 

185.5 

180.5 

180.6 

180.7 

181.0 

185.7 

185.7 

183.2 

183.5 

296.3 

296.4 

294.3 

294.4 

0.14 

1.71 

I 

103.1 

103.7 

116.1 

128.3 

302.9 

302.9 

298.0 

297.3 

182.6 

182.6 

178.9 

178.0 

186.5 

186.6 

184.2 

181.8 

302.0 

302.1 

296.6 

295.4 

302.3 

302.3 

297.2 

297.3 

0.25 

5.11 


Table 5.3 Values of dihedral angles corresponding to the alkyl chain part for low energy conformers of AMG3 derived by applying full geometry 
optimization with MM (left) and QM (right) with B3LYP/6-31G* level of theory. After optimization with QM, conformer B was transformed to 
conformer A, therefore they have identical dihedral angles for the corresponding method. 
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(i) (ii) (iii) 



(iv) (v) (vi) 

Figure 5.3 Rotational energy barriers of AMG 3 after applying grid scan analysis with consecutive optimization of dihedral angles 
(i) xi, (ii) t 2 , (iii) 13, (iv) 14, (v) 15 and (vi) Te. In figure, relative energies (differences of initial and final values of total energies with 
rotation) have been used instead of total energies for clarity. 
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Figure 5.4 MD simulations of AMG3 conformers in the lipid bilayer environment which 
consists of DPPC bilayers and water molecules. 



Figure 5.5 MD simulations in lipid bilayer produced three additional low energy confor- 
mations of AMG3 (named J, K and L) in addition to the presented conformers. Their di- 
hedral angles of alkyl chain ti-T 6 were derived after applying MM and QM geometry op- 
timizations in gas phase. These values are 103.9°, 302.1°, 181.2°, 180.2°, 180.0° and 
180.0°, respectively for conformer J; 282.8°, 302.2°, 181.4°, 180.3°, 180.0°, and 180.0°, 
respectively for conformer K; and 274.6°, 179.3°, 180.2°, 180.0°, 180.0° and 180.0°, 
respectively for conformer L. 
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Figure 5.6 Two favored plane angles between rigid and flexible segments of AMG3; -90° 
(on the left) and -140° (on the right) derived using MD simulations in lipid bilayer envi- 
ronment. 


Conformer 

MM-Plane Angle 
(degree) 

QM-Plane Angle 
(degree) 

MM-Relative 

Energy 

(kcal/mol) 

QM-Relative 

Energy 

(kcal/mol) 

A 

87.0 

89.0 

0.16 

0.16 

J 

87.5 

87.5 

0.23 

0.00 

F 

88.8 

86.6 

0.09 

0.15 

K 

89.4 

91.4 

0.00 

0.60 

C 

137.9 

135.5 

1.03 

1.40 

L 

141.4 

137.9 

0.81 

1.30 


Table 5.4 Plane angles between rigid and flexible segments and relative energies of fa- 
vored conformations of AMG3 in lipid bilayer. In table, relative energies (differences of 
total energies based on lowest total energy of conformers calculated by MM and QM 
methods; the lowest energy conformer of AMG3 with MM and QM methods are found to 
be K and J, respectively) have been used instead total energies for clarity. 

5.2. 1.5 3D QSAR/CoMFA Results 


The existing experimental findings combined with our molecular modeling studies assisted 
to obtain the template conformation of AMG3 for the constructions of 3D QSAR models. 
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Both MM and QM calculations resulted to favored conformers in lipid bilayer environment 
which perpendicular plane angle exists between ABC tricyclic ring and flexible alkyl chain 
segments. Initially, these conformers (A, J, F and K) have been used as template con- 
former and also compounds in the data base have been adopted with corresponding tem- 
plate compound conformation, subsequently statistical tests have been performed. The 
template conformer which showed optimum statistical results (conformer A) has been used 
for further analysis. 

Several variations in the alignment schemes by superimposing the similar pharmacophoric 
features were considered. Ci, C2, C3, C4, C4a, C6 a , C7, C10, Cio a , Ciob and the oxygen atoms 
in the template ligand AMG3 (Figure 5.2) were selected for the structural superimposition 
processes. The alignment of the molecules was based on atom-by-atom superimposition of 
selected atoms, which is common in all compounds. The criteria applied for the selection 
were: (i) overlap of the putative biologically relevant pharmacophore groups (with mini- 
mum RMS) and (ii) form of statistically significant 3D QSAR/CoMFA and CoMSIA mod- 
els. In order to build 3D QSAR/CoMFA and CoMSIA models for the binding affinity at 
the CB1 and CB2 receptors, a set of 30 CB analogues for the CB1 receptor and 29 CB ana- 
logues for the CB2 receptor were subjected to the cross-validated PLS analyses (Table 
5.1). Figure 5.7 illustrates the superimposition of CB analogues used as the training set to 
construct CoMFA and CoMSIA models. 



Figure 5.7 Structural alignments of the CB derivatives in the training set for constructing 
3D QSAR/CoMFA and CoMSIA pharmacophore models. 
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Table 5.5 shows the derived statistical results using CoMFA technique for CB1 and CB2 
pharmacophore models. The CoMFA study based on the selected lowest energy con- 
former of template ligand which was determined by a combination of experimental and 
molecular modeling studies, gave r c 2 v values of 0.784 and 0.572 for CB1 and CB2 recep- 
tors, respectively. The non-cross-validated PLS analysis yielded an r 2 of 0.981 and 0.972; 
and the estimated standard errors were 0.173 and 0.187 for CB1 and CB2 receptors, re- 
spectively (Table 5.5). Therefore, the CoMFA-generated 3D QSAR models for the binding 
affinities of CB analogues at CB1 and CB2 receptors have a very good cross-validated cor- 
relation. Table 6 shows the relationship between the experimental and CoMFA-calculated 
pK; (-logKi) values of the non-cross-validated analyses for CB1 and CB2 receptors. Linear- 
ity of the plots at Figure 5.8 shows very good correlations between observed and predicted 
affinities for CoMFA models which developed in study for the binding affinities of CB 
ligands for the CB1 and CB2 receptors. 



CoMFA/CBl 

CoMFA/CB2 

Number of compounds in training set 

30 

29 

r cv 

0.784 

0.572 

r 

0.981 

0.972 

Standard error of estimate 

0.173 

0.187 

F 

197.531 

127.260 

Relative contributions of 
steric/electrosatic fields 

0.640:0.360 

0.632:0.368 

Number of optimal 
components 

6 

6 


Table 5.5 Statistical results obtained by 3D QSAR/CoMFA models for CB1 and CB2. 

The contour maps were used to create a ‘negative’ matrix in the place of the unknown ac- 
tive site and variations of the used ligands can be generated as long as they fit better into 
the ‘imaginary’ active site. Figure 5.9 shows the steric-electrostatic contour maps of the 
CoMFA models for CB1 and CB2 receptors. The individual contributions from the steric 
and electrostatic favored and disfavored levels are fixed at 80% and 20%, respectively. The 
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CoMFA contours of the steric maps are shown in yellow and green colors, and those of the 
electrostatic contour maps are shown in red and blue colors. Greater values of “bioactive- 
measurement” are collected with: more bulky groups near the green colored contours; less 
bulky groups near the yellow colored contours; more positive charge near the blue colored 
contours, and more negative charge near the red colored contours. 


Compound 


CB1 CoMFA model 


pK; (observed) pK;(predicted) 


CB2 CoMFA model 

pK; (observed) pK;(predicted) 


1 

7.02 

7.16 

7.14 

7.25 

2 

6.20 

6.18 

6.43 

6.50 

3 

6.92 

7.09 

7.29 

7.19 

4 

7.24 

7.13 

6.97 

7.13 

5 

7.93 

7.86 

8.03 

7.98 

6 

6.12 

6.20 

6.65 

6.64 

7 

7.55 

7.69 

7.60 

7.87 

8 

6.59 

6.63 

6.98 

7.03 

9 

8.08 

8.09 

8.41 

8.27 

10 

6.50 

6.56 

6.96 

7.03 

11 

6.77 

6.66 

6.99 

6.95 

12 

9.49 

9.34 

9.28 

8.96 

13 

6.87 

6.89 

7.30 

7.39 

14 

9.28 

9.40 

9.66 

9.68 

15 

7.24 

7.17 

6.59 

6.62 

16 

8.74 

8.74 

8.44 

8.44 

17 

7.49 

7.61 

7.71 

7.68 

18 

9.35 

9.50 

8.72 

9.05 

19 

7.32 

7.33 

7.41 

7.51 

20 

5.90 

5.80 

6.64 

6.43 

21 

7.66 

7.73 

- 

- 

22 

9.08 

8.99 

9.31 

9.04 

23 

9.36 

9.13 

9.07 

9.12 

24 

7.23 

6.88 

7.00 

7.16 

25 

8.90 

9.19 

9.54 

9.35 

26 

6.18 

6.53 

7.48 

7.19 

27 

9.15 

9.11 

8.99 

9.29 

28 

6.72 

6.58 

7.20 

7.26 

29 

7.66 

7.52 

7.08 

6.93 

30 

8.66 

8.54 

8.48 

8.39 


Table 5.6 Observed and predicted pK, (by CoMFA models) values at the training set for 
CB1 and CB2 receptors. 
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Experimental and calculated pKi values of cannabinoid analogs at the CB 1 receptor Experimental and calculated pKi values of cannabinoid analogs at CB2 receptor 
CoMFA analysis CoMFA analysis 




Figure 5.8 Plots of corresponding CoMFA-predicted and experimental values of binding 
affinity (given as pK;) of CB analogues in the training set at the CB1 (on the left) and CB2 
(on the right) receptors, respectively. 

5.2. 1.6 3D QSAR/CoMSIA Results 

Several combinations of stereoelectronic fields (steric, electrostatic, H-bond donor, H-bond 
acceptor and hydrophobicity fields) of 3D QSAR/CoMSIA models were obtained from the 
compounds in the data set. The optimal 3D QSAR/CoMSIA model was derived by the 
combination of steric and electrostatic potential fields. This model based on the selected 
lowest energy conformer of template ligand AMG3, gave r c 2 v values of 0.746 and 0.625 for 

the CB1 and CB2 receptors, respectively (Table 5.7). The non-cross-validated PLS analy- 
sis yielded an r of 0.944 and 0.912, and the estimated standard errors were 0.296 and 
0.324 for CB1 and CB2 receptors, respectively (Table 5.7). Relationship between the 
CoMSIA-predicted and experimental pKj values of the non-cross-validated analyses for 
CB1 and CB2 receptors have been shown in Table 5.8 and Figure 5.10. 

Figure 5.11 shows the steric-electrostatic contour maps of the CoMSIA models for the 
CB1 and CB2 receptors. Since the CB analogues used in the training set differ mainly in 

o 

the Cl' position and the tricyclic part of A -THC or the CBD skeleton, the contour plots 
place more emphasis to these regions. 
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Figure 5.9 (top) CoMFA contour maps of template compound 12 (AMG3) (on the left) 
and its corresponding CBD analogue 13 (on the right) for the CB 1 model. (Regions I, II, 
and III show contour maps around alkyl side chain, tricyclic part and a-face of Cl' of 
ligand, respectively), (bottom) CoMFA contour maps of template compound 12 (on the 
left) and its corresponding CBD analogue 13 (on the right) for the CB2 model. 



CoMSIA/CB 1 

CoMSIA/CB 2 

Number of compounds in training set 

30 

29 


0.746 

0.625 

r 

0.944 

0.912 

Standard error of estimate 

0.296 

0.324 

F 

65.031 

47.855 

Relative contributions of 
steric/electrosatic fields 

0.890:0.110 

0.918:0.082 

Number of optimal components 

6 

5 


Table 5.7 Statistical results obtained by 3D QSAR/CoMSIA models for CB1 and CB2. 
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5.2. 1.7 Discussion 


When correlations are sought using reported data, one must take into account (i) large vari- 
ability in testing procedures and (ii) uncertainties related to enantiomeric purities of syn- 

c qa oc 

thetic molecules. A careful examination of published data ' ' identifies essential molecu- 
lar fragments contributing to ‘cannabimimetic activity’. One of them is the aliphatic C3- 
alkyl side chain; the role of this pharmacophore is important for hydrophobic interactions 
with the site(s) of action. There is an established SAR which indicates longer side chains 
are correlated with more potent CBs . 84 ' 1 " 6 Decreasing the length of the n-pentyl side chain 
of A 9 -THC by two carbons, reduces potency by 75%, and extension of the five carbon 
atom chain by adding one or two carbons favors binding, while further extension is detri- 
mental. Interestingly, analogues with substituents, e.g. CH 3 , C 2 H 5 , Cl, or I in the ortho po- 
sition to the phenolic hydroxyl, retain substantial biological activity, however the para sub- 
stitution produces inactive analogues . 84 ' 126 Accordingly, para substituents prevent the side 
chain from orienting to a southern direction with respect to the phenolic hydroxyl group, 
resulting in decreased CB activity. On the other hand, ortho substitution allows such an 
orientation. Thus, the orientation of the alkyl side chain plays an important role in the 
determination of biological activity. A significant degree of conformational restriction can 
be imposed upon the alkyl side chain either by the introduction of a double bond or a new 
cyclic ring fused to the aromatic ring A, leading to variations in biological responses. 
Khanolkar et al. presented a series of A -THC analogues, in which the 77 -heptyl side chain 
was restricted by a C2-C3 cyclohexyl ring, and showed that the alkyl side chain pointing 
downwards has a 18-fold higher binding affinity for the CB1 receptor and a 3 -fold higher 
binding affinity for the CB2 receptor than the respective analogue in which the alkyl side 
chain orients laterally. 
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Compound 

CB1 CoMSIA model 

pKi (observed) pKi(predicted) 

CB2 CoMSIA model 

pKi (observed) pKq(predicted) 

1 

7.02 

7.19 

7.14 

7.31 

2 

6.20 

6.12 

6.43 

6.43 

3 

6.92 

7.03 

7.29 

7.19 

4 

7.24 

6.98 

6.97 

7.15 

5 

7.93 

7.54 

8.03 

7.74 

6 

6.12 

6.40 

6.65 

6.75 

7 

7.55 

7.63 

7.60 

7.97 

8 

6.59 

6.51 

6.98 

7.01 

9 

8.08 

7.83 

8.41 

8.18 

10 

6.50 

6.51 

6.96 

7.08 

11 

6.77 

6.91 

6.99 

7.04 

12 

9.49 

9.00 

9.28 

8.68 

13 

6.87 

6.89 

7.30 

6.99 

14 

9.28 

9.20 

9.66 

9.18 

15 

7.24 

7.15 

6.59 

6.99 

16 

8.74 

8.45 

8.44 

8.43 

17 

7.49 

8.03 

7.71 

8.10 

18 

9.35 

9.82 

8.72 

9.32 

19 

7.32 

7.33 

7.41 

7.46 

20 

5.90 

5.98 

6.64 

6.45 

21 

7.66 

7.95 

- 

- 

22 

9.08 

9.05 

9.31 

9.24 

23 

9.36 

9.13 

9.07 

9.18 

24 

7.23 

6.74 

7.00 

7.15 

25 

8.90 

9.19 

9.54 

9.35 

26 

6.18 

6.65 

7.48 

7.37 

27 

9.15 

9.10 

8.99 

9.33 

28 

6.72 

6.61 

7.20 

7.38 

29 

7.66 

7.82 

7.08 

6.78 

30 

8.66 

8.55 

8.48 

8.26 


Table 5.8 Observed and predicted pIQ (by CoMSIA models) values at the training set for 
CB1 and CB2 receptors. 


The CB1 and CB2 receptors belong to the same receptor family and exhibit a 44% se- 
quence homology, which rises to 68% in the TM domains, an area thought to be involved 
in ligand recognition. 5 Because of this high degree of homology, it is not surprising that 
binding affinities for CB1 and CB2 receptors are correlated. Figures 9 and 11 show the 
field contributions to the binding affinity among the CBs and provide a visualization of 
both steric and electrostatic interactions at the receptor site. The result demonstrates the 
importance of the hydrophobic components of the CBs with cannabimimetic activity and is 
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Figure 5.10 Plots of corresponding CoMSIA-predicted and experimental values of binding 
affinity (given as pK;) of CB analogues in the training set at the CB1 (on the left) and CB2 
(on the right) receptors, respectively. 

consistent with other studies. The CoMSIA results are in agreement with the CoMFA re- 
sults. The contour maps resulted by applying CoMFA and CoMSIA methodologies dem- 
onstrate that there are similar and different structural requirements for optimum ligand 
binding at the CB1 and CB2 receptors. Derived 3D contour maps of CoMFA and CoMSIA 
models were investigated in the three distinct regions: 

Alkyl side chain-Molecular segment I: The green colored contours along the left side of the 
end of the alkyl chain show that bulky groups enhance the binding affinity for the CB 1 and 
CB2 receptors in both CoMFA and CoMSIA models (Figures 5.9 and 5.11). For example, 
the presence of adamantane, phenyl, t-butyl, isopropyl, or cyclopentyl groups in this region 
is expected to enhance CB1 and CB2 receptor binding affinities. There are large yellow 
colored contours on the right side of the end of the alkyl side chain in the CB1 and CB2 
CoMSIA models (Figure 5.11), and small areas for the corresponding CB1 and CB2 
CoMFA models (Figure 5.9) showing the existence of sterically unfavorable fields (the 
areas in which steric bulk is predicted to decrease binding). Thus, the orientation of the al- 
kyl side chain plays an important role in determining biological activity. This result con- 
firms the previous published reports. 84 ' 104 
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Figure 5.11 (top) CoMSIA contour maps of template compound 12 (AMG3) (on the left) 
and its corresponding CBD analogue 13 (on the right) for the CB 1 model. (Regions I, II, 
and III show contour maps around alkyl side chain, tricyclic part and a-face of Cl' of 
ligand, respectively), (bottom) CoMSIA contour maps of template compound 12 (on the 
left) and its corresponding CBD analogue 13 (on the right) for the CB2 model. 

Compounds 12, 14, 16, 18, 22, 23, 25, and 27 show high activity but low selectivity for the 
CB1 and CB2 receptors attributed to their fit in the hydrophobic subsite of both receptors. 6 
An optimal interaction is observed when a lipophilic group is attached to Cl' position. The 
CB1 receptor appears insensitive to isosteric groups attached to the Cl' position whereas 
the CB2 receptor shows a higher preference for the smaller dioxolane five-membered ring 
rather than the dithiolane ring or more hydrophobic cyclopentyl analogues. 67 
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ABC ring -Molecular segment //: The yellow colored contour at the a-face of the C-ring in 
the A S -THC analogues (Figures 5.9 and 5.11, on the left) indicates areas in which steric 
bulk is predicted to decrease binding. However, in the case of CBD analogues this area fits 
on the C9-methyl group (Figures 5.9 and 5.11, on the right). Bulky groups localized be- 
tween molecular segments I and II are expected to reduce the binding affinities of CB ana- 
logues at both CB1 and CB2 receptors. In these regions, the steric interactions affect dif- 

o 

ferently the binding affinities of A -THC and CBD analogues for the CB1 and CB2 recep- 

o 

tors in both the CoMFA and CoMSIA models. In A -THC analogues, a sterically unfavor- 
able area (yellow colored contour) is located between the regions I and II (Figures 5.9 and 
5.11, on the left). In the case of CBD analogues, because of the different structural orienta- 
tion of the bicyclic segment, this area fits on the methyl and propenyl groups (Figures 5.9 

o 

and 5.11, on the right). If the binding affinity value of A -THC analogues and their respec- 
tive CBD analogues is compared, CBD analogues usually have lower binding affinities 

o 

than their corresponding A -THC analogues. For example, the template compound 12 , has 
425 -fold and 97 -fold higher binding affinities than its respective CBD analogue 13 , for 
CB1 and CB2 receptors, respectively. This can be explained by different topographical re- 

o 

quirements for the A -THC and CBD derivatives at the cyclic ring segment. The CB1 re- 
ceptor is more sensitive than the CB2 receptor to this different structural orientation, be- 
cause in this region the sterically unfavorable area (yellow colored contour) is larger at the 
CB1 model (Figures 5.9 and 5.11). 

a-face of Cl '-Molecular segment 111 : Sterically unfavorable contour (yellow colored) is 
localized in the vicinity of ring A (Figures 5.9 and 5.11). Therefore, the existence of bulky 
groups in this molecular segment results in the decrease of the binding affinity as it is con- 
firmed by compounds 15 and 16 . Figure 5.12 shows the steric-electrostatic CoMSIA con- 
tour maps of compound 15 for CB1 and CB2 receptors, respectively. The contour maps 
show that the increased binding affinity and pharmacological potency are associated with 
bulky (green colored contours) and negatively charged groups (red colored contours) in the 
a-face of Cl', (Figures 5.9 and 5.11). The presence of such groups (e.g., C 6 H 5 OH, 


94 



C 6 H 5 CF 3 , C 6 H 5 CCI 3 , C 6 H 5 CI 3 , C 6 H 5 I, etc.) in this region, are expected to enhance CB1 and 
CB2 receptor binding affinities. 

The electrostatic contour maps that were correlated with the predicted potency were seen 
in the a-face of Cl' (molecular segment III) for the both of CoMFA and CoMSIA models 
and in the middle of the alkyl side chain (molecular segment I) predominantly in the 
CoMFA models. Results show that in the a-face of Cl' and in the middle of the alkyl side 
chain, ligands may interact with corresponding electropositive and electronegative atoms 
of CB1 and CB2 receptors, respectively (Figures 5.9 and 5.11). 

In order to test the predictive ability of the obtained CoMFA and CoMSIA models, 20 

o 

other A -THC analogues have been added to the training set for the CB1 model and 13 

o 

other A -THC analogues have been added to the training set for the CB2 model. (Binding 
affinities have been taken from reported values in the literature ’ . Binding affinities of 7 
CB analogues have been measured only for the CB1 receptor). The same CoMFA and 
CoMSIA settings and PLS analyses with initially derived models have been performed for 
the re-constructed CoMFA and CoMSIA models. Compound 12 has been used as a tem- 
plate and same atoms in the CoMFA and CoMSIA models have been selected for the struc- 
tural superimposition processes for re-constructed models. 



Figure 5.12 CoMSIA contour maps of 15 for the CB1 (on the left) and the CB2 models 
(on the right). 
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o fin 

Table 5.9 shows the added A -THC analogues from literature ' to the training set. Figure 
5.13 illustrates the structural superimposition of CB analogues for the re-construction of 
derived CoMFA and CoMSIA models. 

In order to re-build 3D QSAR models for the binding affinity at the CB1 and CB2 recep- 
tors, a training set of 50 CB analogues for CB1 receptor, and 43 CB analogues for CB2 
receptor were included in the cross-validated PLS analyses. Table 5.10 shows the cross- 
validated and non-cross-validated r values in all CoMFA analyses. The CoMFA study, 
based on selected lowest energy conformer of AMG3, gave cross-validated r values of 
0.770 and 0.614 for CB1 and CB2 models, respectively. The non-cross-validated PLS 
analysis yielded an r of 0.955, and 0.926 and the estimated standard errors were 0.242 and 
0.289 for CB1 and CB2 models, respectively. Thus, the CoMFA re-generated 3D QSAR 
models for the binding affinities to the receptors CB1 and CB2 has a very good cross- 
validated correlation. Figure 5.14 shows the relationship between the calculated and ex- 
perimental pK; values of the non-cross-validated analyses for CB1 and CB2 receptors. Fig- 
ure 5.15 shows the steric-electrostatic contour maps of the re-generated CoMFA models 
for CB1 and CB2 receptors. 

Table 5.11 shows the derived statistical results from re-constructed CoMSIA analysis. The 
CoMSIA study, based on AMG3 as template, gave cross-validated r 2 values of 0.740 and 
0.572 for models CB1 and CB2, respectively. The non-cross-validated PLS analysis 
yielded an r 2 of 0.891, and 0.883 and the estimated standard error was 0.369 for both CB1 
and CB2 models. Figure 5.16 shows the relationship between the calculated and experi- 
mental pK; values of the non-cross-validated analyses for CB1 and CB2 receptors. Figure 
5.17 shows the steric-electrostatic contour maps of the re-obtained CoMSIA models for 
CB1 and CB2 receptors. 


96 








Figure 5.13 Structural superimpositions of the compounds for the re-construction of 
CoMFA and CoMSIA models. 



CoMFA/CB 1 

CoMFA/CB2 

Number of compounds in training set 

50 

43 

c; 

0.770 

0.614 

r 

0.955 

0.926 

Standard error of estimate 

0.242 

0.289 

F 

152.826 

90.517 

Relative contributions of 
steric/electro static fields 

0.606:0.394 

0.599:0.401 

Number of optimal 
components 

6 

5 


Table 5.10 Cross-validated and non-cross-validated analyses at the CB1 and CB2 recep- 
tors using the re-obtained CoMFA models, based on the AMG3 CB analogue used as the 
template. 
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Experimental and calculated pKi values of cannabinoid analogs at the CB I receptor Experimental and calculated pKi values of cannabinoid analogs at die CB2 receptor 
CoMFA analysis CoMFA analysis 




Figure 5.14 Plots of corresponding re-obtained CoMFA-calculated and experimental val- 
ues of binding affinity (given as pK;) of CB analogues in the training set at the CB 1 (left) 
and CB2 (right) receptors, respectively. 


The results from re-obtained models did not significantly modify the initially obtained 
models. The re-constructed 3D QSAR/CoMFA and CoMSIA models for the binding af- 
finities to the CB1 and CB2 receptors have a very good cross-validated correlation. Al- 
though there are minor differences between initial and re-constructed CoMFA and CoM- 
SIA models, the overall emerging picture is consistent. The main topographical require- 
ments in the re-constructed CoMFA and CoMSIA models confirm the initially obtained 
models for the CB 1 and CB2 receptors. The predictive ability of the initial model has been 
tested with added compounds, and it was shown that the model is able to accurately predict 
them as true unknowns. 
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Figure 5.15 (top) CoMFA contour maps of template ligand 12 (AMG3, left) and its corre- 
sponding CBD analogue 13 (right) for the re-obtained CB 1 model, (bottom) CoMFA con- 
tour maps of 12 (left) and its corresponding CBD analogue 13 (right) for the re-obtained 
CB2 model. 


101 





CoMSIA/CB 1 

CoMSIA/CB2 

Number of compounds in training set 

50 

43 

r cv 

0.740 

0.572 

r 

0.891 

0.883 

Standard error of estimate 

0.369 

0.369 

F 

71.836 

44.096 

Relative contributions of 
steric/electro static fields 

0.947:0.053 

0.921:0.079 

Number of optimal 
components 

5 

6 


Table 5.11 Cross-validated and non-cross-validated analyses at the CB1 and CB2 recep- 
tors using the re-obtained CoMSIA models, based on the AMG3 CB analogue used as the 
template. 


Experimental and calculcicd pKi values of cannahinoid analogs at the CB 1 receptor Experimental and calculated pKi values of cannabinoid analogs at the CB2 receptor 
CoMSIA analysis CoMSIA analysis 




Figure 5.16 Plots of corresponding re-obtained CoMSIA-calculated and experimental val- 
ues of binding affinity (given as pK;) of CB analogues in the training set at the CB 1 (left) 
and CB2 (right) receptors, respectively. 
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Figure 5.17 (top) CoMSIA contour maps of template ligand 12 (AMG3, left) and its cor- 
responding CBD analogue 13 (right) for the re-obtained CB1 model, (bottom) CoMSIA 
contour maps of 12 (left) and its corresponding CBD analogue 13 (right) for the re- 
obtained CB2 model. 

5.2.2 Conformational Analysis of AMG3 at the Binding Site of the Receptor 
5.2.2.1 Molecular Docking Studies 

3D models of the CB1 and CB2 receptors were constructed by several groups (e.g.; Salo et 
al., “ Shim et al. and Tuccinardi et al. ) with a molecular modeling procedure, using the 
X-ray structure of bovine rhodopsin 110 as the initial template and taken into account the 
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available site-directed mutagenesis data. 3D models of CB receptors were obtained from 

IT 

Tuccinardi et al. and used in the in silico docking simulations; however, the critical 
amino acids for the CB binding are determined considering reported CB models studied 
mentioned above. In addition, ligand binding pockets of the receptors have been obtained 
by the Biopolymer module of Sybyl molecular modeling package.' Figure 5.18 shows the 
proposed binding pockets for the CB1 and CB2 receptors. Two binding pockets in the CB1 
and five binding pockets in the CB2 receptors have been determined. The found largest 
cavities of the receptors include same positions of the critical amino acids reported in the 
literature. Since conformers A, C, F, J, K and L are found as the most favored stable con- 
formers through MD simulations in solution (DPPC bilayer environment without receptor), 
flexible docking has been employed to these six conformers using FlexX docking algo- 
rithm of Sybyl molecular modeling package.' The mean, the highest and the lowest values 
of the best 30 binding scores for the each complex of the CB1 and CB2 receptors and 
AMG3 conformers and the standard deviations between the scores are presented in Table 
5.12. Among the conformations, the conformer C of AMG3 shows the best binding com- 
plex with the active site residues of both CB1 and CB2 receptors. Figure 5.19 (top) shows 
the localization of AMG3 ligand at the binding site of the CB1 (left) and CB2 (right) re- 
ceptor models. Core of TM3-TM7 helices mainly participate to the binding cavities. Figure 
5.19 (bottom) shows the interactions of binding site residues with the AMG3. The bioac- 
tive CB ligand AMG3 stabili z es its interactions with the active site through non-bonding 
van der Waals interactions with the non-polar surfaces of the active site residues of CB1 
receptor (i.e., Luel93, Phe200, Thr201, Ile247, Pro251, Thr283, Trp356, Leu360, Val387) 
and CB2 receptor (i.e., Phell7, Leul94, Leu255, Trp258). 
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Figure 5.18 Proposed binding pockets for CB1 (left) and CB2 (right) receptor models. 


The main characteristic of the low energy conformers of AMG3, both in solution and at the 
active site of the receptor is the high flexibility of the alkyl side chain. This is eminent by 
the low energy barriers observed in the various low energy rotamers of the molecule. It is 
noticed that the CB1 receptor has two available binding pockets (named as SI and S2) for 

the accommodation of CB ligands (Figure 5.20i). SI and S2 binding pockets constitute two 

00 

cavities of ~7 A and ~10 A depths, respectively. They can both accommodate the alkyl 
side chain segment of CB ligands. Our findings are in accordance with previous re- 
ports 5 ' 6 ' 67 , which show that extension of the five carbon atom chain (~7 A) of THC by one 

o o 

or two carbon atoms (-10 A) improves binding, while further extension (>10 A) is detri- 
mental due to steric hindrance. However, CB2 receptor has only one ligand binding pocket 
(Figure 5.20ii). Population analysis of docking modes for both CB1 and CB2 receptors 
showed that conformer C has highest propensity to bind at the active site of the CB 1 and 
CB2 receptors. 
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Figure 5.19 (top) Ligand location at the active site of the receptors CB1 (left) and CB2 
(right); (bottom) AMG3 stabilizes its binding mainly through van der Waals interactions 
with the non-polar surfaces of the active site residues of CB1 receptor (left), (i.e., Luel93, 
Phe200, Thr201, Ile247, Pro251, Thr283, Trp356, Leu360, Val387) and CB2 receptor 
(right), (i.e., Phell7, Trpl94, Leu255, Cys257, Trp258). 
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(i) 


Conformer 

Mean 

Best 

Worst 

Std. 

Dev. 

A 

-9.93 

-11.40 

-9.26 

0.60 

C 

-10.15 

-11.43 

-9.50 

0.56 

F 

-9.50 

-10.86 

-8.90 

0.46 

J 

-9.77 

-11.14 

-9.18 

0.49 

K 

-9.55 

-10.67 

-9.01 

0.44 

L 

-9.55 

-11.15 

-9.07 

0.49 


(ii) 


Conformer 

Mean 

Best 

Worst 

Std. 

Dev. 

A 

-9.82 

-11.50 

-9.31 

0.48 

C 

-10.26 

-12.52 

-10.26 

0.69 

F 

-8.38 

-9.22 

-8.15 

0.25 

J 

-10.29 

-11.80 

-9.73 

0.55 

K 

-9.35 

-11.15 

-9.35 

0.66 

L 

-8.90 

-10.66 

-8.26 

0.66 


Table 5.12 The mean, the highest and the lowest values of the best thirty binding scores 
for each complex of the (i) CB1 and (ii) CB2 receptors with AMG3 conformers, as well as 
the standard deviations between the binding scores. 


5 . 22.2 Molecular Specificity for the SI and S2 Binding Pockets at CB1 Receptor 

Since AMG3 has conformational flexibility that can accommodate SI and S2 binding 
pockets, it should be of great interest to study the conformational preferences of these two 
cavities located at the binding site of the receptor. Design of molecules with specific pref- 
erence to either site may be of biological significance. S2 is deeper than SI pocket and ac- 
commodates preferably the all trans conformation of the alkyl side chain segment of 
ligands. Unsaturation of the alkyl side chain of compounds leads their orientations towards 
S2 pocket. For this reason an analogue of AMG3 with four unsaturated bonds at the alkyl 
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chain was designed in order to specifically be directed to S2 cavity of the CB1 receptor 
(Figure 5 . 2 1 i) . 



Figure 5.20 (top) Two cavities SI and S2 observed at the active site of the CB1 receptor: 

o o 

SI and S2 pockets constitute two cavities that have ~7 A and ~10 A depths, respectively 
and they accommodate the alkyl chain segment of CBs. (bottom) AMG3 ligand location at 
the CB2 receptor. MOLCAD lipophilic potential surface was calculated for the receptor 
with the Connolly method. Brown color denotes the most lipophilic areas and blue color 
denotes the most polar areas. 


These predictions have been tested with the docking trials of novel analogues possessing 
unsaturation of alkyl side chain of AMG3. Imposing double bond unsaturation to the C2'- 
C3' single bond of alkyl chain of AMG3, is not enough for forcing the side chain towards 
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lateral orientation (top-left on the Figure 5.21 ii) . However further unsaturations e.g., addi- 
tion of double bonds to C4'-C5' (top-right on the Figure 5.21 ii) and C6'-C7' bonds (mid-left 
on the Figure 5.21 ii) leads the orientation of unsaturated alkyl chain towards S2 binding 
pocket. The four unsaturated bonded analogue has optimum alkyl side chain length for the 
S2 ligand binding pocket (mid -right on the Figure 2 1 ii) . Further extension is detrimental 
(bottom on the Figure 5.21 ii) . Figure 5 . 2 1 iii depicts the total FlexX binding scores versus 
number of double bonds at the alkyl side chain of AMG3 analogues mentioned above. Dif- 
ferent unsaturation patterns have been also studied at the alkyl side chain of AMG3 (e.g., 
C2'-C3' and C5'-C6'), in order to cover all possible probabilities without accounting for the 
synthetic difficulty. The docking results showed that imposing double bond unsaturations 
to the C2'-C3' and C5'-C6' single bonds of alkyl chain of AMG3, is not enough to orient to 
the direction of S2 site. These observations may help open new avenues to synthetic chem- 
ists for synthesizing novel compounds. 
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(i) 
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(iii) 



Number of double bonds in the alkyl chain ofthe ligand 

* The best score obtained from 30 docking solutions. 

Figure 5.21 (i) Unsaturation of alkyl chain leads the orientation of chain towards S2. For 
this reason a structure was designed possessing four unsaturated bonds which were di- 
rected specifically to S2 cavity, (ii) Docking of rationally designed AMG3 analogues pos- 
sessing double bonds at the alkyl side chain of the CB 1 receptor. The degree of unsatura- 
tion is critical for the design of analogues to orient towards SI or S2 binding cavity, (iii) 
Total FlexX binding scores versus number of double bonds at the side chain. The optimal 
number of double bonds at the alkyl side chain is four at S2 cavity. 

5.2.23 Second Generation of 3D QSAR Models Based on in Silico Docking Results 

After acquiring the highest percentage of conformer of AMG3 (conformer C) at the active 
site of the CB1 and CB2 receptors, the effect of used template conformation to the derived 
QSAR models was evaluated. For comparison, 3D QSAR/CoMSIA models with smoother 
potential functions have been used for the re-generation of CB1 and CB2 models. 

Same common atoms (Ci, C2, C3, C4, C4a, C6a, C 7 , C10, Cio a , Ciob and the oxygen atoms in 
the template conformers of AMG3) with the initial models were selected for superimposi- 
tion processes. Figure 5.22 shows the superimpositions of CB analogues used as a training 
set to construct 3D QSAR/CoMSIA models based on the conformer C of template ligand 
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(for comparison, superimpositions of CB analogues based on conformer A of template li- 
gand is also presented). 

In order to build the CB1 and CB2 3D QSAR/CoMSIA models, a set of 30 A S -THC and 
CBD analogues (Table 5.1) were analyzed. The pK; values were used in the 3D QSAR cor- 
relations, and cross-validated PLS analyses were applied. Steric and electrostatic field col- 
umns of CoMSIA were generated. The same CoMSIA settings, PLS analyses and valida- 
tions have been applied as initially generated models. A very high correlation was ob- 
served for both models as it is demonstrated by the high values of r" (Table 5.13). Addi- 
tionally, the credibility of the models is proved by the high values of r c 2 v (Table 5.13). 



Figure 5.22 Structural alignments of the compounds in the training set for constructing 3D 
QSAR/CoMSIA models based on conformers A (on the left) and C (on the right) of the 
template ligand 12 . 

Table 5.14 summarizes the experimental (observed) and CoMSIA-calculated pK; results 
for the binding affinities at the CB1 and CB2 receptors. Figure 5.23 shows the relationship 
between the 3D QSAR/CoMSIA predicted and experimental pKj values of the non-cross- 
validated analyses for the constructed models based on conformers A (left, top) and C 
(right, top) of 12 for CB 1 receptor and corresponding conformers (A, left, bottom; and C, 
right, bottom) of 12 for CB2 receptors. The linearity of the plot concerning conformer C 
was better than the linearity of the plot concerning conformer A. Both plots showed good 
correlations for the constructed models. 
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CB1 model -initial 
model (template 
ligand 12- 
conformer A) 

CBl model 
(template ligand 
12-conformer C) 

CB2 model - 
initial model 
(template 
ligand 12- 
conformer A) 

CB2 model 
(template ligand 
12-conformer C) 

Number of com- 
pounds in the training 
set 

30 

30 

29 

29 

r cv 

0.746 

0.764 

0.625 

0.645 

r 

0.944 

0.953 

0.912 

0.940 

Standard error of 
estimate 

0.296 

0.272 

0.324 

0.247 

F 

65.031 

77.600 

47.855 

57.491 

Relative contributions 
of steric:electrosatic 
fields 

0.890:0.110 

0.890:0.110 

0.918:0.082 

0.885:0.115 

Number of optimal 
components 

6 

6 

5 

5 


Table 5.13 PLS analyses for the CB1 and CB2 receptors using the CoMSIA models based 
on compound 12 as template (using two different conformers A and C). 


Figure 5.24 shows the steric and electrostatic CoMSIA contour maps of 12 and its corre- 
sponding CBD analogue 13 for the CB1 model using conformer C of AMG3 as template. 
For easy comparison, CoMSIA contour maps of CB1 model based on conformer A is also 
presented. Figure 5.25 presents the CoMSIA contour maps of 12 and its corresponding 
analogue 13 for the CB2 model using conformer C as a template ligand conformation. For 
easy comparison of contour plots, corresponding CoMSIA contour maps were also pre- 
sented based on template conformer A of 12. 
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Compound 


CB1 CoMSIA model 

pK; (observed) pK;(predicted) 


CB2 CoMSIA model 

pK, (observed) pK;(predicted) 


1 

7.02 

7.38 

7.14 

7.42 

2 

6.20 

5.93 

6.43 

6.29 

3 

6.92 

7.16 

7.29 

7.25 

4 

7.24 

7.11 

6.97 

7.17 

5 

7.93 

7.78 

8.03 

7.98 

6 

6.12 

6.15 

6.65 

6.77 

7 

7.55 

7.66 

7.60 

7.82 

8 

6.59 

6.41 

6.98 

7.01 

9 

8.08 

7.93 

8.41 

8.02 

10 

6.50 

6.33 

6.96 

6.88 

11 

6.77 

6.83 

6.99 

6.93 

12 

9.49 

9.45 

9.28 

9.21 

13 

6.87 

6.86 

7.30 

6.99 

14 

9.28 

8.99 

9.66 

9.01 

15 

7.24 

7.03 

6.59 

6.53 

16 

8.74 

8.80 

8.44 

8.79 

17 

7.49 

7.63 

7.71 

7.70 

18 

9.35 

9.61 

8.72 

9.25 

19 

7.32 

7.25 

7.41 

7.57 

20 

5.90 

5.88 

6.64 

6.29 

21 

7.66 

7.95 

- 

- 

22 

9.08 

9.20 

9.31 

9.19 

23 

9.36 

8.79 

9.07 

9.03 

24 

7.23 

7.03 

7.00 

7.35 

25 

8.90 

9.06 

9.54 

9.27 

26 

6.18 

6.85 

7.48 

7.42 

27 

9.15 

9.05 

8.99 

9.24 

28 

6.72 

6.81 

7.20 

7.36 

29 

7.66 

7.89 

7.08 

7.03 

30 

8.66 

8.39 

8.48 

8.32 


Table 5.14 Summary of experimental (observed) and second generation of CoMSIA 
model predicted pK; results of training set for the binding affinity at the CB1 and CB2 
receptors. 
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Experimental pKj 



Experimental pKi 




Figure 5.23 (top) Plots of corresponding 3D QSAR/CoMSIA predicted and experimental 
values of binding affinity (given as pKj) of CB analogues in the training set at the CB 1 re- 
ceptor for the constructed models based on conformers A (on the left) and C (on the right) 
of 12, respectively for CB1 receptor, (bottom) Plots of corresponding 3D QSAR/CoMSIA 
predicted and experimental values of binding affinity (given as pK;) of CB analogues in the 
training set at the CB2 receptor for the constructed models based on conformers A (on the 
left) and C (on the right) of 12, respectively for CB2 receptor. 
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Figure 5.24 (top) CoMSIA contour maps of 12 (on the left) and its corresponding CBD 
analogue 13 (on the right) for CB1 model. Conformer A is used as template, (bottom) 
CoMSIA contour maps of 12 (on the left) and its corresponding CBD analogue 13 (on the 
right) for CB1 model. Conformer C is used as template. 


Steric and electrostatic contour maps using template conformer C of 12 by 80:20 contribu- 
tion field ratio for the favored and disfavored fields were not enough to visualize these 
contour maps at the cyclic ring segment of the compound. However, increasing the disfa- 
vored level ratio (e.g., 70:30) leads to visible steroelectronic contour maps and it shows 
similar properties as contour maps of template conformer A of 12 at the cyclic ring seg- 
ment of the compound. For example, sterically unfavorable areas are located on the methyl 
or propenyl groups of CBD analogues, these unfavorable regions are located at the vicinity 

o 

of the tricyclic segment of A -THC analogues (Figure 5.26). Thus, contour maps confirm 

o 

higher affinities of A -THC analogues than their corresponding CBD analogues. 
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Figure 5.25 (top) CoMSIA contour maps of 12 (on the left) and its corresponding CBD 
analogue 13 (on the right) for CB2 model. Conformer A is used as template, (bottom) 
CoMSIA contour maps of 12 (on the left) and its corresponding CBD analogue 13 (on the 
right) for CB2 model. Conformer C is used as template. 



Figure 5.26 (top) CoMSIA steric contour maps of 12 (on the left) and its corresponding 
CBD analogue 13 (on the right) for CB2 model using 70:30 favored/disfavored field lev- 
els. Conformer C is used as template. 
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Three general conclusions could be drawn from the characteristics of derived 3D contour 
maps of CoMSIA models using both conformations of template ligand 12: 

1. Steric effects determine the binding affinity. The relative contributions of steric fields 
are larger than the electrostatic fields. 

2. The orientation of the C3-alkyl chain plays a crucial role in determining the biological 
activity. The green colored contours along the left side of the end of the alkyl chain 
(corresponding to shown snapshot contour plots, Figures 5.24 and 5.25) show that 
bulky groups enhance the binding affinity, whereas bulky groups in the right sides of 
the C3 -alkyl chain of analogues lead to decreased binding affinity. 

o 

3. Because of the structural differences of A -THC and CBD derivatives at the cyclic ring 
segment, these groups have different pharmacophoric requirements for their receptors 
in these regions. While sterically unfavorable areas are located on the methyl or pro- 
penyl groups of CBD analogues, these unfavorable regions are located at the vicinity of 

o 

the tricyclic segment of A -THC analogues (Figures 5.24 and 5.25). This explains why 

o 

usually A -THC analogues have higher binding affinities than their corresponding CBD 
analogues. 

The conformers A and C of 12 used as a template compound in CoMSIA analyses show 
similarities and differences in contour maps. Their similarities are reflected in the same 
regions that contour levels of identical color cover. However, close observation reveals 
significant differences in their shape and extent of covering of the contour regions. The 
conformational differences of conformers A and C are localized in the alkyl chain. Our 
results confirm the earlier literature reports that the lipophilic alkyl chain plays crucial role 
in determining cannabimimetic activity for the CB receptors. Thus, the differences of con- 
tour maps at alkyl chain are important for the interpretation of pharmacophore groups that 
affect the binding affinity. When conformer A is used as a template, both THC and CBD 
analogues have green colored contour (depicts sterically favorable groups) at the tail of 
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alkyl chain (Figure 5.24). However, if conformer C is used as a template compound, then 
at the tail of alkyl chain only THC fits green colored sterically favorable contour (left on 
Figure 5.25). CBD analogues do not fit green colored contours but they fit yellow colored 
contours (depicts sterically unfavorable groups) (right on Figure 5.25). These important 
observations are obtained only by the model that was constructed with conformer C of 12. 
The contour plots at the tail of alkyl chain which derived by the model that is constructed 
with conformer C of 12 demonstrates the better binding affinity of THC analogues than the 
corresponding CBD analogues. 

In addition, to validate the higher predictive ability of conformer C of the template ligand 

o 

12, 10 other A -THC analogues have been added to the training set and CoMSIA models 
have been reconstructed (binding affinities have been taken from reported values in the 
literature ' ). The same CoMSIA settings and PLS analyses have been performed for the 
re-constructed CoMSIA models. The same atoms in the template conformers of 12 have 
been selected for the structural superimposition processes. Results did not significantly 
modify the initially obtained models. Re-constructed 3D QSAR/CoMSIA models for the 
binding affinities at the CB1 and CB2 receptors have a very good cross-validated correla- 
tion (0.745 and 0.632, respectively for CB1 and CB2 receptors). Re -constructed models 
validate the initially obtained results: The model based on conformer C of 12 shows better 
statistical results than the model based on conformer A of 12. 

5.2.2.4 MD Simulations of AMG3 at the Active Site of Membrane-associated CB1 and 
CB2 Receptors 

MD simulations have been performed to the systems including AMG3 at the binding site 
of the CB1 and CB2 receptors merged with membrane bilayer in order to analyze the effect 
of critical amino acid residues at the active site of CB receptors to the conformational 
properties of ligands in a more realistic environment. Figure 5.27 shows a representative 
picture of used systems. For these simulations, docked poses of complexes that have high 
population were used as initial ligand-receptor complex coordinates. Torsional angle val- 
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ues of the alkyl side chain of AMG3 throughout the simulations were screened with trajec- 
tory analysis. Results showed that adopted conformations of AMG3 at CB1 and CB2 re- 
ceptors have different conformational properties. Torsional angles at the alkyl chain of 
AMG3 adopt trans for ii, x 3 , 15 and x 6 at the binding site of CB2 receptor with some fluc- 
tuations near this dihedral angle. Trajectory analysis of torsional angles x 3 and 14 show a 
propensity to be gauche-; X4 also adopts trans dihedral angles at the active site of the CB2 
receptor (Figure 5.28). Dihedral angle screening throughout the MD simulations results 
showed that x 3 , X4 and X6 mainly form a trans conformations at the binding site of the CB 1 
receptor. Dihedral angles x 3 and X5 are very flexible and adopt gauche± and trans confor- 
mations at the binding site of the CB1 receptor, however, gauche + /trans and trans/gauche- 
torsional angles are mainly observed for x 3 and X5, respectively (Figure 5.28). 

Therefore, five additional conformations (M, N, O, P and R) have been obtained from 
these simulations (Figure 5.29). Torsional angle screening results of MD analysis showed 
that conformers C, M, N and O of AMG3 favor at the active site of the CB 1 , and conform- 
ed P and R favor at the binding site of the CB2 receptor. 

Although the CB1 and CB2 receptors exhibit a very high sequence homology which rises 
to 68% in the TM regions, there are certain behavior differences of AMG3 conformers at 
the binding sites of receptors. One of the main differences between the MD simulations of 
ligand at the CB1 and CB2 receptors is the different behavior of the first dihedral angle xi 
of the alkyl side chain of AMG3. In the CB1 receptor, there is a high propensity of xi to 
establish a gauche- conformation, however in the CB2 receptor; it prefers to have a trans 
conformation. It is well-known that, different conformational re-arrangements of third and 
sixth TMs of GPCR determine the activation of CBs. In CB2 receptor, alkyl side chain of 
AMG3 conformers align parallel in the ligand recognition part of TM3, while in the CB1 
receptor they align perpendicular. This observation may help to understand the selectivity 
of CB ligands for the CB1 and CB2 receptors. 
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5.2.2.5 Third Generation of 3D QSAR Models Based on Conformational Analysis Re- 
sults of AMG3 by MD Simulations at the Active Site of the Receptor 

In order to test the effect of favored conformations of template compound (AMG3) at the 
binding site of the CB receptors to the QSAR models, initially all possible conformations 
have been tested as template conformer and 3D QSAR/CoMSIA models were re-obtained. 
Same common atoms in the template conformers of AMG3 with the initial models were 
selected for superimposition processes. Compounds in Table 5.1 and their measured activi- 
ties were used for the third generation of CB1 and CB2 models. Initial statistical tests by 
obtained models showed that optimal statistical results derived using template conformer 
O for CB1 and conformer P for CB2 models. Therefore, further analysis is performed us- 
ing these template conformers. Figure 5.30 shows the superimpositions of CB analogues 
used as a training set to construct 3D QSAR/CoMSIA models based on the conformers O 
and P of template ligand. Table 5.15 shows the derived statistical results for CB1 and CB2 
models (in order to easy comparison, results of first and second generation of models were 
also included in the Table). It is clearly shown that the using template conformer obtained 
from MD simulations of ligand at the binding site of the receptor improves statistical re- 
sults, significantly. 
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Figure 5.27 Representative picture of used system for MD simulations (ligand at the binding site of the receptor which is 
merged with lipid bilayer). 
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Figure 28. Torsional angle screening of alkyl side chain of AMG3 throughout the MD simulations at the binding site of (i) CB 1 
and (ii) CB2 receptors. 
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Figure 5.29 MD simulations of AMG3 at the active site of the receptors produced five additional conformations (M, N, O, P 
and R). At the active site of the CB1 receptor, conformers C, M, N and O; at the active site of the CB2 receptor, conformers P 
and R are found as favored conformations of AMG3. At the alkyl side chain these conformers have following dihedral angles: 

M (gauche +/trans/gauche+/trans/trans/trans); N ( gauche+/trans/gauche+/trans/gauche-/trans) ; O 

(gauche+/l rans/t rans/t rans/gauche-A rans)\ P (trans/gauche-AransAransAransArans) and R ( trans/gauche-Arans/gauche - 
AransArans). 
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Experimental and corresponding predicted affinities of CB analogues were presented 
at the Table 5.16 and plotted at Figure 5.31. 



Figure 5.30 Structural alignments of the compounds in the training set for construct- 
ing 3D QSAR/CoMSIA models based on template ligand 12. Superimposition was 
performed based on conformer O for CB1 model (on the left); and conformer P for 
CB2 model (on the right). 

Figure 5.32 shows the steric and electrostatic contour maps of 12 (on the left, top) and 
its corresponding CBD analogue 13 (on the right, top) for the CB1 receptor using as 
template ligand the conformer O; and corresponding stereoelectronic contour maps 
(compounds 12 (on the left, bottom) and 13 (on the right, bottom) for the CB2 recep- 
tor using as template ligand the conformer P. Similar contour plots with initially gen- 
erated models were obtained from the third generation of the models, however, the 
use of different template conformers for CB1 and CB2 models showed some fine and 
critical differences. For example, the orientation of alkyl chain of AMG3 is more re- 
stricted in the CB2 models, since left and right sides of the tail of the alkyl chain in- 
clude yellow colored sterically unfavorable contours, and between these contours 
there is a green colored sterically favorable contour (Figure 5.32, bottom). 
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CB1 model-first 
generation (tem- 
plate ligand 12- 
conformer A ) 

CBl model 
second generation 
(template ligand 12- 
conformer C) 

CBl model 
third generation 
(template ligand 12- 
conformer O) 

CB2 model-first 
generation 
(template ligand 12- 
conformer A) 

CB2 model 
second generation 
(template ligand 12- 
conformer C) 

CB2 model 
third generation 
(template ligand 12- 
conformer P) 

Number of com- 
pounds in the training 
set 

30 

30 

30 

29 

29 

29 

2 

r cv 

0.746 

0.764 

0.771 

0.625 

0.645 

0.710 

r 2 

0.944 

0.953 

0.962 

0.912 

0.940 

0.952 

Standard error of 
estimate 

0.296 

0.272 

0.244 

0.324 

0.247 

0.246 

F 

65.031 

77.600 

97.623 

47.855 

57.491 

72.373 

Relative contributions 
of steric:electrosatic 
fields 

0.890:0.110 

0.890:0.110 

0.852:0148 

0.918:0.082 

0.885:0.115 

0.853:0.147 

Number of optimal 
components 

6 

6 

6 

5 

5 

5 


Table 5.15 Derived statistical results of third generation (based on template conformation derived from MD simulations of template molecule at the 
membrane-associated CB1 and CB2 receptors) of CB1 and CB2 models. (In order to easy comparison, results of first and second generation of mod- 
els were also included in the Table. First generation of models was based on favored template conformation in solution, and second generation of the 
models was based on docking results). 
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Compound 


CB1 CoMSIA model 

pfCi (observed) pKj(pre dieted) 


CB2 CoMSIA model 

pK; (observed) p K ^predicted ) 


1 

7.02 

7.11 

7.14 

7.25 

2 

6.20 

6.00 

6.43 

6.48 

3 

6.92 

6.98 

7.29 

7.23 

4 

7.24 

6.96 

6.97 

7.08 

5 

7.93 

7.73 

8.03 

7.91 

6 

6.12 

6.55 

6.65 

6.43 

7 

7.55 

7.64 

7.60 

7.84 

8 

6.59 

6.67 

6.98 

6.99 

9 

8.08 

7.95 

8.41 

8.03 

10 

6.50 

6.68 

6.96 

6.97 

11 

6.77 

6.82 

6.99 

7.25 

12 

9.49 

9.23 

9.28 

9.03 

13 

6.87 

6.58 

7.30 

7.23 

14 

9.28 

9.56 

9.66 

9.56 

15 

7.24 

7.18 

6.59 

6.37 

16 

8.74 

8.81 

8.44 

8.51 

17 

7.49 

7.85 

7.71 

7.79 

18 

9.35 

9.37 

8.72 

9.28 

19 

7.32 

7.15 

7.41 

7.46 

20 

5.90 

5.82 

6.64 

6.48 

21 

7.66 

7.94 

- 

- 

22 

9.08 

9.14 

9.31 

9.35 

23 

9.36 

9.33 

9.07 

9.10 

24 

7.23 

6.98 

7.00 

7.05 

25 

8.90 

8.87 

9.54 

9.13 

26 

6.18 

6.30 

7.48 

7.33 

27 

9.15 

8.90 

8.99 

9.10 

28 

6.72 

6.61 

7.20 

7.24 

29 

7.66 

7.77 

7.08 

7.38 

30 

8.66 

8.75 

8.48 

8.45 


Table 5.16 Summary of experimental (observed) and CoMSIA-predicted pK; results 
of training set by third generation of the models for the binding affinity at the CB1 
and CB2 receptors. 


Obtained contour plots also merged with the receptor coordinate files (Figure 5.33, 
top). Results confirmed the accuracy of obtained contour plots by third generation of 
QSAR models. For example, sterically unfavorable contours of CB1 model fit with 
the side chains of the amino acid residues at the binding site of the CB1 receptor (e.g., 
Leul90, Argl86, Trp279, Tyr275, Tyr355 and Val364); (Figure 5.33, bottom). 
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Figure 5.31 Plots of experimental and predicted values of binding affinities (given as 
pKj) of CB analogues in the training set obtained by third generation of 3D QSAR 
(left) CB 1 model and (right) CB2 model. 



Figure 5.32 Steric and electrostatic contour maps of 12 (on the left, top) and its corre- 
sponding CBD analogue 13 (on the right, top) for the CB1 receptor using the template 
ligand as conformer O; and corresponding stereoelectronic contour maps (compounds 
12 (on the left, bottom) and 13 (on the right, bottom) for the CB2 receptor using the 
template ligand as conformer P. 
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Figure 5.33 (top) Contour plots of CB1 model merged with CB1 receptor, (bottom) 
Close-look to the contour maps at the binding site of the CB 1 receptor. 

5.2.3 De Novo Drug Design Studies of CB Analogues 

The optimal derived PLS analyses of CB analogues, which are produced from third 
generation of QSAR models, were used to generate each site points for CB1 and CB2 
models. These site points were used in the de novo drug discovery program LeapFrog, 


130 





for the predictions of novel hits by repeatedly making structural changes and then ei- 
ther keeping or discarding the results. Since third generation of CB1 and CB2 models 
were derived based on different conformations of template conformers, a selectivity 
of new generated ligand structures by LeapFrog can be expected. The basic informa- 
tion about LeapFrog was explained in the “Computational Details” chapter of the the- 
sis. The new ligand structures were evaluated on their binding energies, and structures 
that have better binding energy than reference compound (template compound, 
AMG3) were collected. Tables 5.17 and 5.18 show these structures for CB1 and CB2 
receptors, respectively. Predicted binding affinities based on derived QSAR models 
were also included in the tables. Derived molecules that have highest predicted bind- 
ing affinity for CB1 and CB2 receptors (Dl and D12, Tables 5.17 and 5.18) were 
docked at the binding site of the CB1 and CB2 receptors, respectively (Figures 5.33 
and 5.34). Their predicted high binding affinities confirmed by better docking scores 
than AMG3 (i.e., Dl and D12 have binding scores of -19.11 kJ/mol and -19.23 
kJ/mol, respectively). The Dl stabilizes its interactions with the binding site forming 
H-bonds with the amino acid residues (e.g., Argl86, Thrl97 and Pro251) of CB1 as 
well as van der Waals interactions with the non-polar surfaces of the active site resi- 
dues of CB1 receptor (e.g., Thrl97, Phe200, Thr201, Ile247, Leu250, Pro251, 
Tyr275, Trp279, Thr283, Leu360). The D12 stabilizes its interactions with the bind- 
ing site forming H-bonds with the amino acid residues (e.g., Leul60, Leul63, Serl65, 
Tyrl66, Leul67, Prol68) as well as van der Waals interactions with the non-polar 
surfaces of the binding site residues of CB2 receptor (e.g., Lysl09, Leu 160, Leu 163, 
Pro 168, Pro 187, Tyrl90, Trpl94, Trp258). 

5.2.4 Homology Modeling Studies of CB Receptors 

If the X-ray structure of a ligand-bound receptor is not available, homology models of 
the protein can be used to obtain the ligand binding cavities. The steroelectronic prop- 
erties of these cavities are directly related to the performed molecular model coordi- 
nates. Thus, the use of different template structures for homology may result in varia- 
tion of ligand binding modes. In order to validate the obtained results using CB1 and 
CB2 receptor models based on bovine rhodopsin (1F88, pdb code), alternative CB1 
and CB2 comparative models have been attempted employing template structure of 
(32-adrenergic receptor. The initial structure was taken from the cholesterol bound 
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form of human [12-adrenergic receptor (pdb code, 3D4S). 127 The water molecules and 
the cholesterol were removed from the system. Sequence alignment has been obtained 
with Biopolymer module of Sybyl. 27 CB1 and CB2 receptors show 28% and 24% se- 
quence identity, respectively (Figure 5.35). Initial geometry optimization calculations 
have been carried out with Powell algorithm using Tripos force field. 27 Subsequently, 
these receptors have been subjected to 2 ns MD simulations using Gromacs. 

Before the simulations, geometry optimization of receptors has been performed with- 
out constrains using steepest descent integrator for 10000 steps with the minimization 
tolerance of 100 kJ/(mol.nm). Cluster analysis of obtained coordinate file of trajecto- 
ries has been performed with g_cluster module of Gromacs and each simulation 
yielded 8 clusters. Sequence alignment of representative of these clusters with the 
rhodopsin based receptor models has been performed with Accelrys DS 2.0 pro- 
gram 128 and models that have smallest RMSD values (using C“ atoms as the reference 
points) for CB1 and CB2 were used for further analysis. Obviously there are some 
fine differences between models based on rhodopsin and p2-adrenergic receptor for 
each receptor; however, the motifs of the seven member TM helices have been kept. It 
should be noted that the active site residues have smaller RMSD values (< 3A) than 
the other amino acid residues. For clarity, superimposition of rhodopsin and [12- 
adrenergic based receptor models of CB1 has been shown at Figure 5.36. In Figure 
5.37, the ligand binding pockets of CB1 have been shown with novel obtained model 
based on [12-adrenergic receptor; figure clearly confirms the two obtained binding 
pockets from the previous model based on rhodopsin. 
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Figure 5.33 Binding interactions of D1 at the binding site of the CB 1 receptor. 



Figure 5.34 Binding interactions of D12 at the binding site of the CB2 receptor. 
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Figure 5.35 Sequence alignment of CB receptors based on p2-adrenergic receptor. 



Figure 5.36 Superimposition of rhodopsin (yellow colored) and p2-adrenergic (cyan 
colored) based CB1 receptor models from both side and top views. 
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Figure 5.37 Two binding pockets (S 1 and S2) were identified at the rhodopsin based 
CB1 model (left). CB1 receptor model obtained by using template of human p2- 
adrenergic receptor confirmed these two ligand binding pockets between the TM3- 
TM6 (right). 
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Chapter 6. Computational Design of Novel Fullerene 
Analogues as Potential HIV-1 PR Inhibitors: Analysis 
of the Binding Interactions between Fullerene Inhibi- 
tors and HIV-1 PR Residues Using 3D QSAR, Mo- 
lecular Docking and MD Simulations 
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6. 1 Introduction 


Inhibition of HIV-1 PR leads to the production of immature virus particles and pre- 
vention of further rounds of infection. 129,130 HIV-1 PR is an important target enzyme 
for anti-acquired immunodeficiency syndrome (AIDS) drug design as its inhibition 
leads to the production of non-infectious viral particles. 129 In the last few years, many 
potent and selective HIV-1 PR inhibitors have been developed and approved as drugs 
for the inactivation of this enzyme by the Food and Drug Administration, while sev- 
eral others are under clinical investigation. 129 ' 131 Since the nature of most of these 
drugs is peptide-like, their oral bioavailability and half-life are limited. 132 Experimen- 
tal findings indicate the rapid emergence of drug resistance to most of the HIV-1 PR 
inhibitors, because site specific mutations in the enzyme occur at one or more resi- 
dues. 133 These mutations are conservative and involve a similar set of amino acid resi- 
dues in response to exposure to different inhibitors, thus giving rise to cross resis- 
tance. 133 Therefore, several research groups tried to develop non-peptidic HIV-1 PR 
inhibitors, which may block the mutations responsible for this resistance. 11 ' 134 

In the last decade, fullerene and its derivatives have been extensively investigated for 
biomedical applications. Inhibition of HIV- 1 PR by fullerene analogues demonstrated 
by Friedman et al. n and complexation of HIV-1 PR with fullerene compounds has 
been supported by molecular modeling studies. 1 '" These studies showed that the 
fullerene can be perfectly accommodated inside the binding pocket of HIV-1 PR 
(Figure 6.1). The active site of the HIV-1 PR is approximately an open-ended cylin- 
drical hydrophobic cavity with 10 A diameter composed of catalytic aspartic acid 
residues Asp25 and Asp25'. n ' 13 The complementary spatial relationship between 
[60]fullerene and the active site of the HIV-1 PR enzyme has led to the suggestion 
that fullerene-based derivatives could have potential use as effective HIV-1 PR inhibi- 
tors. 11 ' 1 '" The binding interactions of [60]fullerene derivatives in the active site of the 
HIV-1 PR have been examined through a combination of several molecular modeling 
techniques. 13 ' 134 Kinetic analysis of the HIV-1 PR enzyme in the presence of various 
water-soluble fullerene derivatives suggests a competitive mode of action. 13 This is 
attributed to the ability of fullerene derivatives to form bonds with the catalytic site 
and the van der Waals interactions with the non-polar HIV-1 PR surface thereby, im- 
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proving the binding. lj4 Since the binding affinity values of “first generation” fullerene 
inhibitors were not significant (EC50 ~10" 6 M), further structural investigation is re- 
quired in order to propose new HIV-1 PR/fullerene complexes with better binding af- 
finity. For this aim, a set of synthetically reported fullerene derivatives (Table 
6 .1)135,136 | iave been usec j construct models with the 3D QSAR methodologies: 
C 0 MFA 9 and C 0 MSIA 10 . 



Figure 6.1 Perfect fit of a fullerene derivative at the active site of the HIV-1 PR (ac- 
tive site residues have been shown as molecular surface for clarity). 

6.2 Results and Discussion 

In order to improve the structure alignment and derive statistically reliable models, 
initially, the most potent fullerene analogue in the data base (compound 1, Table 6.1) 
has been docked in the binding cavity of 3D model of HIV-1 PR (hitherto X-ray 
structure of a fullerene ligand-bound HIV-1 PR is not available, thus, HIV-1 PR struc- 
ture has been received from the X-ray structure of haloperidol-bound HIV-1 PR (pdb 
code, 1AID) 35 . The derived best docked complex structure has been subjected to MD 
simulations, in order to stabilize the localization of fullerene at the HIV-1 PR. 3D- 
QSAR/CoMFA and CoMSIA methods were applied to the data set, which was di- 
vided into training and test sets. To our knowledge, this study is the first 3D QSAR 
application to the fullerene based compounds. Both CoMFA and CoMSIA studies 
gave similar results indicating that the steric effects are essential for the activity; 
which is reasonable because of high non-polar property of compounds at the data set. 
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Table 6.1 Chemical structures, measured affinities and binding energies and calcu- 
lated binding scores of fullerene derivatives. 

After obtaining the optimum position of most potent fullerene analogue 1 inside the 
HIV-1 PR, 3D QSAR/CoMFA and CoMSIA methods have been performed. The 
logarithmic I/EC 50 values (pECso) were used in the 3D QSAR correlations, as they 
are related to changes in the free energy of binding. Several variations in the align- 
ment procedures are considered by superimposing similar pharmacophoric features. 
Highlighted carbon atoms (32 central carbon atoms of fullerene) for the template 
ligand 1 are selected for the structural superimposition processes (Figure 6.2i). The 
alignment of the molecules was based on atom-by-atom superimposition of selected 
atoms, which are common in all compounds. Figure 6.2ii illustrates the superimposi- 
tion of the molecules in the training set. Compounds 5, 16 and 17 have been used as a 
test set. They possess higher, similar, and lower inhibition effects than average pECso 
values of the studied compounds which represents the whole data set. The cross vali- 
dated PLS method was then subjected to the training set. Table 6.2 summarizes the 
statistical results. 
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(i) (ii) 


Figure 6.2 (i) Selected atoms of the template compound 1 for structural superimposi- 
tions of the compounds in training set. (ii) Structural alignments of the compounds in 
the training set for constructing 3D QSAR/CoMFA and CoMSIA models at HIV-1 
PR receptor. 



CoMFA 

CoMSIA 

r cv 

0.549 

0.555 

r 2 

0.994 

0.997 

Standard error of estimate 

0.097 

0.051 

F 

509.545 

1216.442 

Relative contributions of 
steric/electrosatic fields 

0.776:0.224 

0.872:0.128 

Number of optimal 
components 

4 

5 


Table 6.2 Cross-validated analyses using CoMFA and CoMSIA methodologies, 
based on template compound 1. 
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3D QSAR/CoMFA and CoMSIA studies gave r c 2 v values of 0.549 and 0.555, respec- 
tively. The non-cross validated PLS analysis yielded r 2 values of 0.994 and 0.997, re- 
spectively. Figure 6.3 shows the relationship between the experimental and predicted 
pECso values of cross-validated PLS analysis for CoMFA and CoMSIA. The signifi- 
cance of the proposed models is verified by the good predictions of the activity of 
compounds belonging to the test set. Test set compounds 16 and 17 both in CoMFA 
and CoMSIA gave good prediction results (Table 6.3). Compound 5 is underestimated 
by CoMFA about 1.0 unit but CoMSIA result gave satisfactory estimation of its activ- 
ity (error on prediction is less than a unit). 


Compound 

pECso (observed) 

CoMFA 

pEC5o(predicted) 

CoMSIA 

PEC50 (predicted) 

1 

7.00 

7.05 

6.95 

2 

5.30 

5.25 

5.31 

3 

6.80 

6.65 

6.84 

4 

5.13 

5.22 

5.11 

5 

6.31 

5.22 

5.76 

6 

2.89 

2.84 

2.97 

7 

4.12 

4.03 

4.08 

8 

3.64 

3.69 

3.56 

9 

3.85 

3.92 

3.80 

10 

5.60 

5.52 

5.62 

11 

6.05 

5.96 

5.94 

12 

5.53 

5.59 

5.61 

13 

5.66 

5.83 

5.66 

14 

5.20 

5.14 

5.16 

15 

5.14 

5.23 

5.23 

16 

4.66 

4.68 

4.67 

17 

3.86 

3.46 

3.67 

18 

4.75 

4.64 

4.73 

19 

4.14 

4.17 

4.15 

20 

5.11 

5.17 

5.18 


Table 6.3 Summary of experimental (observed) versus CoMFA and CoMSIA- 
predicted pECso results for the binding affinity of fullerene derivatives at the HIV-1 
PR. 


The CoMFA and CoMSIA contour maps were used in order to visualize the steroelec- 
tronic requirements of the binding cavity of HIV-1 PR. Figures 6.4 and 6.5 show the 
steric-electrostatic contour maps of the CoMFA and CoMSIA models (for the com- 
pounds that show the best and worst inhibition effects within the data set for the HIV- 
1 PR receptor, compounds 1 and 6 in Table 6.1 , respectively). 


147 




CoMSIA 



Experimental pEC 50 


Figure 6.3 Plots of observed and 3D QSAR/CoMFA (left) and CoMSIA-predicted 
(right) binding affinities of fullerene analogues in the training set at the HIV-1 PR. 


Template compound 1 has better inhibition effect than 6, around 10 4 -fold. This can be 
explained by different topographical requirements of 1 and 6; both aromatic rings of 
1, fit very well with the green colored contour map which shows sterically favorable 
places, there are no bulky groups of 6 in this field (Figure 6.4). In addition, inactive 
compound 6 has a flexible chain and fits with the yellow colored contour which 
shows sterically unfavorable areas at the CoMSIA model (Figure 6.4). Moreover, 
electrostatic contour maps of 1 and 6 clearly show the differences both in CoMFA and 
CoMSIA models. For example, the -OH group of 1 fits on red colored contour (nega- 
tive charged favored) while, -C=0 group of 6 fits on blue colored contour (positive 
charged favored), (Figure 6.4). Relative contributions of steric and electrostatic fields 
obtained by CoMFA and CoMSIA are 0.78:0.22 and 0.87:0.13, respectively. 



Figure 6.4 (Left) CoMFA contour maps of template compound 1 (shown in top-left 
and bottom-left which has highest binding affinity in the training set) and compound 6 
(shown in top-right and bottom-right which has lowest binding affinity in the training 
set). (Right) Corresponding CoMSIA contour maps for compounds 1 and 6. 


148 



Hitherto, a lot of fullerene derivatives have been synthesized by several experimental 
groups but a limited number of them have been subjected to bioactivity test against 
HIV-1 PR. Obviously, when the number of compounds increases in the data set, the 
stability of the constructed model is expected to be increased. However, in this par- 
ticular case, it is difficult to increase the number of compounds in the data set because 
the experimentally examined fullerene compounds for HIV-1 PR inhibition are lim- 
ited and measured ones do not show very diverse binding affinities. For this reason, 
an attempt has been performed in QSAR studies: The binding energy results of bio- 
logically evaluated fullerenes in HIV-1 PR cavity was examined with docking simula- 
tions and a correlation between experimental versus theoretical values has been found. 
Thus, it is proposed to enlarge the data set including the structures from computation- 
ally designed compounds and their activities from the docking simulations to con- 
struct QSAR models. This idea may be used by other research groups if the experi- 
mental data are limited and/or not diverse, the coordinate file of the receptor (using X- 
ray data or homology model) is available and the active site of the receptor is well de- 
fined. 

The large variations in binding affinities of the designed fullerene inhibitors with 
HIV-1 PR and the relations between biological activity and the flap motion of the en- 
zyme, as well as, the connection between the biological activity and the conforma- 
tional changes in the catalytic site of the HIV-1 PR, were also analyzed. The follow- 
ing three steps in our computational design strategy have been studied: (i) Initially, 
several monoadducts and bisadducts of [60]fullerene have been designed with some 
modifications of reported structures in the literature (Table 6.1) in order to explore 
more conformational space. Most of the designed fullerene structures include 1,3- 
cyclohexadiene derivatives. The experimental methodology to synthesize this kind of 
fullerene derivatives was reported by An et a l." 1 (ii) In order to diminish the stability 
problems, CoMSIA models with smoother potential functions have been used in fur- 
ther 3D QSAR investigations of the fullerenes. Novel monoadducts and bisadducts of 
[60]fullerene have been designed with the aid of 3D QSAR/CoMSIA models and their 
binding affinities at the HIV-1 PR have been tested employing molecular docking. In 
order to use proper input coordinates of HIV-1 PR/fullerene derivative complex in the 
docking simulations, MD simulations were employed by Gromacs program. 36 In order 
to understand the effect of fullerene ligand to the conformational changes of amino 
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acid residues of active site of HIV-1 PR, both ligand-free and ligand-bound systems 
were used in the simulations (Figure 6.5). The flexibility of flap region of HIV-1 PR 
has been discussed by both experimental and computational studies. 1 " 8 ' 1 " 9 The free 
protease adopts closed, semi-open or fully open forms in a dynamic equilibrium. 
However, the closed form is favored when the inhibitor is bound in the cavity of the 
reaction site. 138 Friedman et a/. 11,13 proposed that there is a direct correlation between 
inhibitory of a compound and the amount of hydrophobic surface area that it can 
desolvate. The MD simulations by Zhu et al. 140 showed the exclusion of water near 
the flap regions in order to accommodate the fullerene inhibitor. The decreasing of the 
water density in the cavity leads to the enhancement of the hydrophobic interaction 
between the fullerene derivative and the active site of the enzyme, (iii) The Leapfrog 
de novo drug design program and 3D QSAR/CoMSIA contour maps have been used 
in order to generate a series of potent fullerene based HIV-1 PR inhibitors. This ap- 
proach is fruitful and can be of general use because: (a) it minimizes the compounds 
to be synthesized; (b) it leads a rational design for the subsequent compounds to be 
synthesized and (c) it saves time and reduces the cost and human effort. 

The inhibition effect of limited monoadducts and bisadducts of [60]fullerene has been 
biologically evaluated. Table 6.1 shows the biological activities of 20 fullerene de- 

IT ITS 1TA 1 AT, 

rivatives reported in the literature. ’ ’ ’ In order to explore more conformational 

space and include more diverse binding affinities in the data set, computationally de- 
signed fullerene analogues (Table 6.4) have been analyzed using docking studies. The 
formula AG = -RTlnK, was used to convert the FlexX binding energies to estimated 
binding affinities. Since the experimental binding activities of most of the derivatives, 
which are of interest in this study only reported as median effective concentration 
(EC50), these values are assumed to be equal with K, in the calculations of the free 
binding energies. A similar working hypothesis has been used by Naik et al . 141 and 
Conn et al. 142 The reported K; and EC50 values for compound 4 are 5.3 pM and 7.3 
pM, respectively. Assuming that K, and EC50 are identical, the resulting error for AG 
of compound 4 is 2.6 %. 


150 








42 
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n ,NH./\ /V V»° 
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0 nh 2 ^^ A» 

H 2 N 0 
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0.04 
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0.18 

44 


- 32.2 

2.47 

45 

H HH \ \_/ / H H 

- 23.7 

75 

46 

O^OH ° 

- 40.2 

0.10 

47 

/^V-TN, oh 

VVTy' 0H 

oh 

- 35.3 

0.71 

48 

7*. HQ n 

\ /V V° 

H ° 1\ \ / /Y^n 

OH Hcf o 

- 30.0 

5.98 

49 

0 

\^nh 2 

0 

- 34.9 

0.84 


Table 6.4 Computationally designed fullerene derivatives, their binding scores and 
estimated binding affinities. 

A deeper understanding of the mechanistic events associated with HIV-1 PR binding 
is important for the design of new inhibitors with enhanced activity. In the docking 
calculations the FlexX docking algorithm was used which considers docking to a rigid 
protein. It is well known that, the conformations of binding pocket residues of a pro- 
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tein affect the binding score. The coordinates of active site residues of fullerene bound 
HIV-1 PR maybe different than the coordinate of haloperidol derivative bound HIV-1 
PR. Hence, the correct binding mode of the studied inhibitors can best be obtained 
from the MD simulations of HIV- 1 PR/fullerene derivative complex. The most potent 
reported compound in Table 6.1 (compound 1) was docked in the binding cavity of 
HIV-1 PR and docking has been employed using the FlexX docking program. The 
coordinate file of the best FlexX molecular docking pose (which is associated with the 
biggest -in absolute value- binding energy) is used in the MD simulations of 
fullerene-bound HIV-1 PR. The average coordinate file of HIV- 1 PR from the final 1 
ns trajectory files of MD simulations of complex system has been used in the re- 
docking calculations. 



Figure 6.5 Two different systems were used in MD simulations: (i) A ligand-free rec- 
tangular box with HIV-1 PR and solvent (water) molecules, (left on the figure), and 
(ii) the rectangular box includes fullerene analogue at the binding site of the enzyme 
and solvent (water) molecules, (right on the figure). 

HIV-1 PR is a C2 symmetric homo dimer with a large substrate binding pocket cov- 
ered by two glycine-rich [5-hairpins, or flaps. 139 Consistent structural differences have 
been found between the free and bound systems of the HIV-1 PR (average structures 
from simulation have been considered; in Figure 6.6, initial forms have been shown 
with turquoise color and average structures from MD simulations have been shown 
with red color). In the fullerene inhibitor bound forms, the flaps are pulled in toward 
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the dual Asp25 catalytic site (the closed form), while the structure for the free HIV-1 
PR adopts a semi-open form with flaps shifted away from the catalytic site, however, 
still substantially closed to the active site. The relative orientation of the p-hairpin 
flaps is reversed in the two forms (Figure 6.6). The results are in agreement with the 
MD simulations of non-fullerene based HIV-1 PR inhibitor. 139 

Changes of the HIV-1 PR binding cavity for the free and fullerene-bound systems 
were screened by employing the change of distance between C“ atoms of the two cata- 
lytic residues, Asp25 and Asp25', as well as, the change of distance between C“ atoms 
of the residues at the flap region (Gly48, Gly49, Ile50, Gly51 and Gly52 amino acid 
residues in chain A and their corresponding residues in chain B). In the ligand-free 
system, only some perturbations around the initial distance have been found for the 
distance between C“ atoms of Asp25 and Asp25'; Gly48 and Gly48'; Gly51 and 
Gly51'; Asp25' and Gly49'; Gly52 and Gly52', however, distances between the C“ at- 
oms have been found increased ~4.0 A between Asp25 and Gly49; ~ 1.0 A between 
Gly49 and Gly49’; and ~0.5 A between Ile50 and Ile50' through the simulations (Fig- 
ure 6.7i). In the fullerene-bound system a detectable decrease in distances has been 
observed during the simulation with some perturbations at certain intervals (Figure 
6.7ii). For example, distances between C“ atoms have been decreased -1.5 A between 
Asp25 and Asp25'; -0.5 A between Gly49 and Gly49'; -2.0 A between Asp25' and 
Gly49; -2.0 A between Gly48 and Gly48'. Distances between Ile50 and Ile50'; Gly51 
and Gly51'; Gly52 and Gly52' did not change significantly throughout the simulations 
(Figure 6.7ii). A small increase in the distance (-1.0 A) has been observed between 
the C“ atoms of the amino acids Asp25 and Gly49. Therefore, fullerene inhibitor tends 
to keep HIV-1 PR cavity in a closed form. 

The difference of behavior at the flap region for these systems leads to two different 
orientations. It has been observed that flap loop of chain A lies perpendicular to that 
of chain B in both the fullerene bound and ligand free systems. However, in contrast 
to the ligand free system, chains orient toward the catalytic dual Asp25 residues in 
fullerene bound system. 
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(ii) 



Semi-open form Closed form 


Figure 6.6 (i) In the fullerene inhibitor-bound forms, the flaps are pulled in toward 
the bottom of active site (the closed form), while the structure for the free HIV-1 PR 
adopts a semi-open conformation with flaps shifted away from the dual Asp25 cata- 
lytic site, however still substantially close to the active site. In figure, initial forms are 
shown with turquoise color and average structures of simulation are shown with red 
colors. For clarity, only heavy atoms of the fullerene have been shown in the average 
structure of the simulation; (ii) A side-view of binding cavity of HIV-1 PR (from av- 
erage structures of MD simulations of free (left) and inhibitor bound (right) systems) 
illustrates the semi-open and closed forms. 
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The torsional angles (OD2-C 7 -C l ’-C a ) of the catalytic dyad (Fig. 6.8i) were also inves- 
tigated for the ligand-free and ligand-bound systems. The trajectory analysis of the 
defined torsional angles of Asp25 and Asp25' for ligand free HIV-1 PR have shown 
that they keep their value mainly at -1407-300° and -1007-300°, respectively. (Fig- 
ure 6.8ii). These values are -200° for Asp25 and -280° for Asp25' in fullerene-bound 
HIV-1 PR (Fig. 6.8iii). 

Atom positions between the crystal (1AID) and the average structure from MD have 
been compared for both ligand-free and ligand-bound systems in order to understand 
which parts of the receptor are more stable and which are floppy. For this purpose, a 
script with color scale has been used under VMD program (version 1.8.6) 40 where 
blue colored places show no change in distance (stable) and red colored places show 
floppy areas (Figure 6.9). As it is clearly shown, flap regions, catalytic part, and ter- 
mino-lateral fields of HIV- 1 PR at both ligand-free and fullerene-bound systems show 
more flexible conformations throughout the simulations. 

After obtaining the average structure of HIV- 1 PR from MD simulations, computed 
binding scores of the synthetic derived fullerene analogues (Table 6.1) from docking 
simulations were compared with their corresponding experimental results (Table 6.1). 
The linearity of the plot (r 2 =0.69, N=19, compound 20 used as outlier) shows a good 
correlation between computed binding scores and experimental binding energies, 
(Figure 6.10). 

Tables 6.1 and 6.4 show the list of fullerene derivatives used in molecular docking 
studies, their binding energies and their estimated binding affinities from computed 
binding energies. In order to design novel fullerene derivatives with high inhibition 
effect of HIV-1 PR, 3D QSAR/CoMSIA models were employed. As it is mentioned 
above, binding affinity results from experimental measurements of fullerene-based 
inhibitors at the HIV-1 PR are limited. Since experimental and computed binding en- 
ergies showed a good correlation, both experimental binding affinities of structures at 
Tables 6.1 and estimated binding affinities at Table 6.4 have been used to form 3D 
QSAR models. The used structures have a wide variation of biological activity (-Re- 
fold variances in binding affinity). 43 molecules were used as the training set and the 
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rest 6 derivatives (compound numbers: 16 , 27 , 34 , 39 , 43 and 48 , Tables 6.1 and 6.4) 
were used as test set at the CoMSIA analysis. The test set includes compounds repre- 
senting all categories of activity of the training set, e.g., inactive and active com- 
pounds comprising all structural features that are important for the activity. Among 
the analogues at Tables 6.1 and 6.4, compound 23 was selected as a template, because 
it has the highest binding affinity at the HIV-1 PR in the training set. Several varia- 
tions in the alignment schemes are considered by superimposing the similar pharma- 
cophoric features. Highlighted carbon atoms (32 central carbon atoms of fullerene) for 
the template ligand 23 are selected for the structural superimposition processes (Fig- 
ure 6.11). The alignment of the molecules was based on atom-by-atom superimposi- 
tion of selected atoms, which are common in all compounds. Figure 6.12 illustrates 
the superimposition of fullerene analogues used as the training set to construct CoM- 
SIA models. The cross validated PLS method was then applied to the training set. Ta- 
ble 6.5 summarizes the statistical results. 

Five different combinations of steroelectronic fields of 3D QSAR/CoMSIA models 
were obtained from the set of biologically evaluated and computationally designed 
fullerene derivatives (training set=43 , test set=6) in order to predict novel compounds 
with improved inhibition effect. The best 3D QSAR/CoMSIA model (CoMSIA-4: 
steric/electrostatic/H-bond donor/H-bond acceptor) yielded a cross validated r 2 value 
of 0.739 and a non-cross validated r 2 value of 0.993. Using the neutral and ionized 
states of the fullerene analogues in data set did not affect significantly the constructed 
QSAR models. Thus, neutral states of fullerene derivatives were used in the further 
analyses. The selected 3D-QSAR/CoMSIA model (CoMSIA-4) for the estimated 
binding affinities of fullerene analogues at the HIV-1 PR has a good cross validated 
correlation. Figure 6.13 shows the relationship between the estimated pECso values 
from FlexX binding energies and CoMSIA-predicted results of the non-cross- 
validated analyses for the HIV-1 PR. Linearity of the plot shows very good correla- 
tion for the CoMSIA model. Table 6.6 summarizes the estimated binding affinities 
from molecular docking results and CoMSIA-predicted pECso results for the fullerene 
derivatives at the HIV-1 PR. A good correlation was observed in CoMSIA of the 
fullerene derivatives as it is demonstrated by the very high values of r 2 . Additionally, 
the credibility of the models is evidenced by the high values of r cv 2 . However, the real 
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(ii) 
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Figure 6.7 (i) Change of the HIV-1 PR binding cavity for the ligand-free and; (ii) fullerene-bound systems were evaluated in terms of distance be- 
tween the catalytic residues, Asp25 and Asp25' as well as distance between the residues of the flap region (Gly48, Gly49, Ile50, Gly51 and Gly52) 
were screened throughout the MD simulations. 
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Figure 6.8 (i) The torsional angle of (OD2-C y -C |3 -C a ) of the catalytic dyad for the (ii) 
ligand-free and the (iii) ligand-bound systems throughout the MD simulation. 
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Figure 6.9 Atom positions between the crystal (1AID) and the average structure from 
MD simulations have been compared for both systems ligand-free and inhibitor- 
bound systems in order to understand which parts of the enzyme are more stable and 
which are floppy. Blue colored fields show no change in displacement, red colored 
fields show high flexible regions. 




Figure 6.10 Plot of experimental and computed binding energies for the reported 
fullerene analogues. 

significance of the proposed model is verified by the good predictions of the activity 
of compounds belonging to the test set (Table 6.7). Their pECso values of these com- 
pounds ranges between 4.04 and 6.74 and their biological activities were predicted 
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from the PLS equations derived from CoMSIA-4 model. All compounds showed pre- 
dicted values within one logarithmic unit difference from the estimated pECso values. 



Figure 6.11 Selected atoms of the template compound 23 for structural superimposi- 
tions of the compounds in training set. 



Figure 6.12 Structural alignments of the compounds in the training set for construct- 
ing 3D-QSAR/CoMSIA model at HIV-1 PR. 

The contour maps were used to create a matrix in the place of the active site and 
variations of the used ligands can be generated as long as they fit better into the bind- 
ing site. Figure 6.14i shows the steric-electrostatic contour maps of the CoMSIA 
models for the compounds 23 that shows the highest and 36 that shows the lowest in- 
hibition effects within the data set for the HIV-1 PR enzyme. In addition, H-bond do- 
nor and H-bond acceptor contour maps are shown in Figure 6.14ii. The individual 
contributions from the H-bond donor and H-bond acceptor favored and disfavored 
levels are fixed at 80% and 20%, respectively. The contours for H-bond donor fa- 
vored fields have been shown in cyan color while its disfavored fields have been 
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CoMSIAl 

(STR/ES) 

CoMSIA2 

(STR/ES/ACC) 

CoMSIA3 

(STR/ES/DON) 

CoMSIA4 

(STR/ES/ACC/DON) 

CoMSIA5 

(STR/ES/ACC/ 

DON/HYD) 

r 2 

*cv 

0.616 

0.733 

0.630 

0.739 

0.670 

r 2 

0.970 

0.991 

0.985 

0.993 

0.993 

Components 

6 

6 

6 

6 

6 

F 

191.713 

632.364 

392.176 

824.144 

861.108 

Std. Err. 
Rel. Contr. 

0.266 

0.148 

0.188 

0.130 

0.127 

Steric 

0.707 

0.512 

0.551 

0.426 

0.243 

Electrostatic 

0.293 

0.143 

0.197 

0.127 

0.086 

Hydrophobic 

- 

- 

0.252 

- 

0.369 

H-bond donor 

- 

- 

- 

0.167 

0.109 

H-bond acceptor 

- 

0.345 

- 

0.280 

0.193 


Table 6.5 Summary of statistical results of the derived CoMSIA models for the train- 
ing set. (Abbreviations: STR, Steric; ES, Electrostatic; ACC, H-bond acceptor; DON, H-bond donor; 
HYD, Hydrophobic). 



Figure 6.13 Plot of measured and CoMSIA-predicted binding affinities (given as 
pECso) of fullerene analogues in the training set at the HIV-1 PR. 


shown in purple color. H-bond acceptor favored fields have been shown in orange 
color while its disfavored fields have been shown in white color. Derived 3D contour 
maps of CoMSIA models are investigated in the binding cavity of the HIV-1 PR. 
Contour plots confirmed the stability of the constructed models. For example, the es- 
timated EC50 values of 23 and 36 are in nM and in mM ranges, respectively. This can 
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be explained by different topographical requirements for 23 and 36 , in contour maps 
(Figure 6.14). There are large yellow colored contours close to the flap regions of 
HIV-1 PR. Compound 36 shows the existence of sterically unfavorable fields (the ar- 
eas in which bulky groups are predicted to decrease binding). A part of cyclohexadi- 
ene and dihydropyridine groups of 36 fit these unfavorable regions; (right on the Fig- 
ure 6.14 (top)). However, sterically unfavorable yellow colored contour maps do not 
match with the subgroups of 23 (left on the Figure 6.14 (top)). 


Compound 

No. 

Measured 

pEC50 

3D QSAR 
CoMSIA 
Predicted 
pEC50 

Difference 

1 

7.0 

6.90 

0.10 

2 

5.3 

5.33 

-0.03 

3 

6.82 

6.69 

0.13 

4 

5.14 

5.28 

-0.14 

5 

6.31 

6.32 

-0.01 

6 

2.89 

2.80 

0.09 

7 

4.12 

4.24 

-0.12 

8 

3.64 

3.65 

-0.01 

9 

3.85 

3.70 

0.15 

10 

5.60 

5.55 

0.05 

11 

6.05 

5.99 

0.06 

12 

5.14 

5.28 

-0.14 

13 

5.66 

5.62 

0.04 

14 

5.20 

5.28 

-0.08 

15 

5.54 

5.58 

-0.04 

17 

3.86 

3.93 

-0.07 

18 

4.75 

4.81 

-0.06 

19 

4.14 

4.05 

0.09 

21 

6.60 

6.90 

-0.30 

22 

7.29 

7.34 

-0.05 

23 

8.70 

8.70 

0.00 

24 

6.15 

6.07 

0.08 

25 

3.33 

3.49 

-0.16 

26 

6.20 

6.24 

-0.04 

28 

6.18 

6.15 

0.03 
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29 

6.68 

6.31 

0.37 

30 

4.70 

4.62 

0.12 

31 

4.77 

4.92 

-0.15 

32 

5.50 

5.47 

0.03 

33 

6.55 

6.59 

-0.04 

35 

4.23 

4.19 

0.04 

36 

2.25 

2.18 

0.07 

37 

3.36 

3.36 

0.00 

38 

5.73 

5.83 

-0.10 

40 

3.06 

3.36 

-0.30 

41 

7.40 

7.38 

0.02 

42 

7.40 

7.43 

-0.03 

44 

5.61 

5.59 

0.02 

45 

4.13 

3.96 

0.17 

46 

7.00 

7.12 

-0.12 

47 

6.15 

6.07 

0.08 

49 

6.08 

6.07 

0.01 


Table 6.6 Measured and 3D QSAR/CoMSIA predicted binding affinities of com- 
pounds in training set. 


Compound 

Measured 

pEC50 

3D QSAR 
CoMSIA 
Predicted 
pEC50 

Difference 

16 

4.66 

5.39 

-0.73 

27 

5.82 

5.31 

0.51 

34 

4.04 

3.67 

0.37 

39 

4.61 

5.60 

-0.99 

43 

6.74 

7.68 

-0.94 

48 

5.22 

5.47 

-0.25 


Table 6.7 Measured and 3D QSAR/CoMSIA predicted binding affinities of com- 
pounds in test set. 


Furthermore, the subgroups of 23 fit perfectly with the sterically favorable areas 
which are shown with green colored contours (left on the Figure 6.14 (top)). Electro- 
static contour maps (shown with blue and red colored contour maps) are mainly ob- 
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served in the catalytic region of binding cavity. Red colored contours which show 
electronegative favored fields are in very close neighborhood with the -COOH groups 
of 23 . H-bond donor and H-bond acceptor contour maps have been shown in Figure 
6.14 (bottom) in the binding cavity of HIV-1 PR. H-bond acceptor and H-bond donor 
favored regions fit very well with the -COOH groups of 23 and disfavored regions are 
far from the subgroups of the 23 (left on the Figure 6.14 (bottom)). However, sub- 
groups of 36 fit mainly with the purple colored contours which are disfavored for the 
H-bond donor interactions (right on the Figure 6.14 (bottom)). 

Leapfrog de novo drug design software has been used to propose predicted novel 
fullerene HIV-1 PR inhibitors. Molecules 1 , 3 (biologically measured potent mole- 
cules, Table 6.1 ); 23 and 42 (computationally designed monoadduct and bisadduct 
fullerene derivatives that have best predicted binding energy with HIV-1 PR, Table 
6.4) were used as starting structures with allowing the modifications only for the sub- 
groups of fullerene derivatives in Leapfrog simulations. In addition to the de novo 
drug design, 3D QSAR/CoMSIA contour maps were also used to design new 
monoadducts and bisadducts [60]fullerene. More than 100 compounds have been de- 
signed to vary in polarity and contain various groups exerting electrostatic and steric 
interactions at different topographical requirements and their binding energies with 
HIV-1 PR have been computed using molecular docking studies (representative struc- 
tures and their computed binding energies were presented in Table 6.8). These are ex- 
pected to differ in their mode of action as it is indeed observed with the active site of 
the receptor. The binding interactions of 1 (high potent biologically evaluated 
fullerene derivative, Table 6.1) and 23 (computationally designed fullerene derivative 
with predicted very high potency, Table 6.4) at the active site of the HIV-1 PR have 
been compared in Figure 6.15. Compound 1 forms two hydrogen bonds between hy- 
drogen atom of -OH group of ligand and oxygen atoms of the -COO group of Asp25 
catalytic amino acid residue at chain B, together with oxygen atom of -OH group of 
ligand and hydrogen atom of backbone -NH group of Ala28 at chain B (left on the 
Figure 6.15). On the other hand, 23 forms a hydrogen bond network between -COOH 
groups of ligand and catalytic amino acid residues Asp25, Gly27, Asp29, Asp30 at 
chain A together with Asp25 and Gly27 at chain B (right on the Figure 6.15). The van 
der Waals interactions of these ligands with non-polar HIV-1 PR surface have been 
observed mainly at the flap part of the cavity. 
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It must be noted that this is a computational study, thus the easy feasibility of synthe- 
sizing any of the designed derivatives has not been thoroughly examined. 



Figure 6.14 (top) CoMSIA steric/electro static contour maps of template compound 
23 (template compound; has best binding affinity in training set, left on the figure) 
and compound 36 (has worst binding affinity in training set, right on the figure), (bot- 
tom) CoMSIA H-bond donor/H-bond acceptor contour maps of compounds 23 and 36 
(on the left and right of the figure, correspondingly). The individual contributions 
from the H-bond donor and H-bond acceptor favored and disfavored levels are fixed 
at 80% and 20%, respectively. The contours for H-bond donor favored fields have 
been shown in cyan color while its disfavored fields have been shown in purple color. 
H-bond acceptor favored fields have been shown in orange color while its disfavored 
fields have been shown in white color. 
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Table 6.8 Computationally designed fullerene derivatives using 3D QSAR contour 
maps and de novo drug design and their binding energies (the top binding score from 
FlexX molecular docking at HIV-1 PR). 
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Figure 6.15 The binding interactions of 1 and 23 at the active site of the HIV-1. 
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Chapter 7. Summary and Conclusions 
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In this study, bioactive conformations of CBs and [60]fullerene derivatives have been 
investigated. The major structural characteristics of these molecules are their amphi- 
philic properties as well as the existence of flexible and rigid pharmacophoric seg- 
ments. Their flexible segment constitutes a challenging field for conformational 
analysis exploring of putative bioactive conformations. In the first part, 3D QSAR, 
molecular docking, MC calculations, MD simulations and de novo drug design of CBs 
have been analyzed in order to propose novel and selective high affinity CB ana- 
logues for CB1 and CB2 receptors. Since conformational analysis of high active com- 
pound AMG3 was performed in different environments and derived conformations 
were used as template in the generation of 3D QSAR models, the connectivity be- 
tween the used environments, thus complexity level of calculations, and obtained sta- 
tistical results were discussed. In the second part, methodologies applied to CBs were 
performed to the monoadducts and bisadducts of [60]fullerene analogues in order to 
design higher inhibitory effect [60]fullerenes against HIV-1 PR enzyme as well as to 
analyze the connection between biological activity of fullerene derivatives and con- 
formational changes of catalytic and flap regions of the enzyme. 

7.1 CB Derivatives 

Cannabis sativa ( marijuana ) has long been used for its psychotropic and pharmacol- 
ogical effects. In the 1960’s, the correct stereochemistry of both A'-THC and CBD 
were elucidated, and A 9 -THC was identified as the principle psychoactive component 
of cannabis sativa. A wide range of pharmacological effects was described for A 9 - 
THC (e.g., analgesic, neuroprotective, anti-inflammatory, antiemitic, bronchodilatory, 
anticonvulsant). The structure elucidation of A 9 -THC allowed also the design of syn- 
thetic analogues (e.g., A S -THC) starting from 1970’s. A S -THC has a very similar 
pharmacological profile as A 9 -THC, however it is chemically more stable. Given psy- 
chotropic effects of CBs, many biological investigations employed brain and brain 
plasma membranes as study objects. Consensus data describing several key character- 
istics of CB action emerged that CB analogues elicit biological effects in a stereo and 
structurally selective manner. Their binding to brain plasma membrane is saturable, 
stereospecific, concordant with in vitro and in vivo bioresponses (e.g., analgesia) and 
nonrandom in select brain regions. Thus, these characteristics strongly implied that 
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CB pharmacology is receptor-mediated and search led to discovery of cloning of two 
GPCRs for CBs namely CB1 and CB2 receptors with a high sequence homology. 
However, hitherto no direct observation of a CB ligand bound to a CB receptor using 
X-ray crystallography has been reported. Thus, active sites of these receptors have 
been postulated from various in silico methods such as molecular docking using ho- 
mology model receptor systems and 3D QSAR studies. Alignment process is a critical 
part in 3D QSAR studies and affects statistical results. In addition, if derived 3D 
QSAR PLS analysis will be used in the steps of de novo drug design, it may affect 
also the proposed novel molecules. Thus, in the present study, the conformational 
analysis of template compound AMG3 which used in 3D QSAR studies have been 
analyzed at with and without receptor environments. 

MD simulations of AMG3 in lipid bilayer trajectory analysis showed that the dihedral 
angle defined between aromatic and dithiolane ring of the alkyl side chain of AMG3 
shows more resistance to be transformed to another torsional angle because the het- 
erocyclic part interacts with the ring A of the rigid segment and leads to restriction of 
the rotation. The T 3 -T 6 dihedral angles are transformed to optimal dihedral angle value 
of all trans conformation. Thus, the wrapped conformations are dynamically less 
probable in lipid bilayer than the linear conformations. The second dihedral angle 12 
at the alkyl chain shows very flexible character and it is the critical dihedral angle in 
lipid bilayer for producing different low energy conformations. Both of MM and QM 
geometry optimization calculations showed that, favored conformations of AMG3 
derived from MD in lipid bilayer form a perpendicular plane angle between tricyclic 
ring and flexible segments. The obtained results suggest that synthetic analogues in- 
corporating restraints (multiple bonds or rings) at different positions of alkyl chain 
may be of importance to be synthesized. Especially, such synthetic analogues are im- 
portant if they restrict the favorable angles and avoid the perturbation effects around 
the favorable dihedral angle. 

MD simulations have been also performed to the systems including AMG3 at the 
binding site of the CB1 and CB2 receptors merged with membrane bilayer in order to 
analyze the effect of amino acid residues at the binding cavities of CB receptors to the 
conformational properties of ligand. Although the CB1 and CB2 receptors exhibit a 
high sequence homology, there are certain behavior differences of AMG3 conformers 


174 



at the binding sites of receptors. One of the main differences between the MD simula- 
tions of ligand at the CB1 and CB2 receptors is the different behavior of the first di- 
hedral angle ii of the alkyl side chain of AMG3. In the CB1 receptor, there is a high 
propensity of ti to establish a gauche + conformation, however in the CB2 receptor; it 
prefers to have a trans conformation. It is well-known that, different conformational 
rearrangements of third and sixth TMs of GPCR determine the activation of CBs. In 
CB2 receptor, alkyl side chain of AMG3 conformers align parallelly in the ligand rec- 
ognition part of TM3, while in the CB1 receptor they align perpendicularly with the 
ligand recognition part of TM3 . This observation may help to understand the selectiv- 
ity of CB ligands for the CB1 and CB2 receptors. 

Three 3D QSAR models were generated using favored derived template conformers 
from (i) free ligand, (ii) molecular docking and (iii) MD simulations of ligand at bind- 
ing site of the receptor merged with lipid bilayer . Although there are some fine dif- 
ferences between the derived models, three general conclusions could be drawn from 
the characteristic contour maps: (i) The relative contributions of steric fields are larger 
than electrostatic fields; (ii) the orientation of the C3 -alkyl side chain plays an impor- 
tant role in determining the biological activity; (iii) because of the structural differ- 
ences of A S -THC and CBD derivatives at the cyclic ring segment, these groups have 
different pharmacophoric requirements for their receptors in these regions. While 
sterically unfavorable areas are located on the methyl or propenyl groups of CBD ana- 
logues, these unfavorable regions are located at the vicinity of the tricyclic segment of 
A S -THC analogues. Therefore, A S -THC analogues have higher binding affinities than 
their corresponding CBD analogues. Comparison of generated QSAR models were 
shown that when the complexity level of calculations increased (mimicking more ac- 
curately the real environment), it positively affects the obtained statistical results. The 
optimal QSAR PLS analysis which is obtained from the third generation of 3D QSAR 
model was used in the de novo drug design studies and these simulations provided 
novel compounds with enhanced predicted binding affinities. 

If the X-ray structure of a ligand-bound receptor is not available, homology models of 
the protein can be used to obtain the ligand binding cavities. The steroelectronic prop- 
erties of these cavities are directly related to the performed molecular model coordi- 
nates. In the present study, a homology modeling study based on the [!2-adrenergic 
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receptor for both CB1 and CB2 receptors was also performed and results confirmed 
the obtained binding pockets found in receptor models based on rhodopsin. 


7.2 Fullerene Derivatives 

Since the discovery of fullerene in 1985, derivative of fullerenes have been exten- 
sively investigated for biomedical applications (e.g., antibacterial, neuroprotective, 
antiviral and apoptosis). Antiviral activity of fullerenes has been supported by mo- 
lecular modeling and in vitro studies, which provided that the fullerene can be ac- 
commodated inside the catalytic cavity present in the HIV-1 PR. 

The binding affinity values of “first generation” fullerene inhibitors against HIV-1 PR 
were not significant (EC 50 ~10" 6 M). Thus, further structural investigation is required 
in order to propose new HIV-1 PR/fullerene complexes with better binding affinity. 
Novel monoadducts and bisadducts of [60]fullerene have been designed with a com- 
bination of 3D QSAR models and molecular docking studies. In order to use proper 
input coordinates of HIV-1 PR/fullerene derivative complex in the docking, MD 
simulations were employed for the ligand-free and the inhibitor-bound HIV-1 PR sys- 
tems. MD simulations contributed substantially in the understanding of the structural 
changes at the catalytic and flap regions for these two different systems. MD simula- 
tions results showed structural differences between the unbound and bound systems of 
the binding cavity of HIV-1 PR. In the fullerene inhibitor-bound forms, the flaps are 
pulled in toward the bottom of active site (the closed form), while, the structure for 
the free HIV-1 PR adopts a semi-open conformation with flaps shifted away from the 
dual Asp25 catalytic site. MD simulations have also shown that flap, catalytic and 
termino-lateral regions of HIV- 1 PR show more flexibility during the simulations. 

Obtained 3D QSAR models gave high relative contributions of steric fields which 
confirm the importance of the van der Waals interactions with non-polar HIV-1 PR 
surface in the activity of fullerenes. The contour maps from constructed 3D 
QSAR/CoMSIA models together with de novo drug design studies assisted to propose 
novel fullerene derivatives with better predicted potency which can aid synthetic 
chemists to initiate the synthesis of novel fullerene derivatives as HIV-1 PR inhibi- 
tors. 
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Zusammenfassung 


Der Einflub der Konformation eines bioaktiven Molekiils auf sein pharmakologischen 
Profil is seit langem bekannt. Nur die biologisch aktive Konformation eines 
Wirkstoffmolekiils ist in der Lage, eine Bindung mit der aktiven Stelle eines Rezep- 
tors einzugehen. In dieser Arbeit wird die Konformationsanalyse bioaktiver Ver- 
bindungen in verschiedenen Umgebungen diskutiert. Zwei verschiedene Molekiil- 
kategorien wurden untersucht: Cannabinoide (CB) sowie [60]Fullerenderivate. Die 
bedeutsamem strukturellen Merkmale dieser Molekiile sind zum einen, dab sie am- 
phiphil sind und zum zweiten, dab sie sowohl flexible als auch starre pharmakopho- 
rische Segmente besitzen. Insbesondere die flexiblen Teile stellen eine Herausfor- 
derung fiir die Konformationsanalyse moglicher bioaktiver Konformationen dar. 

Im Falle der CB-Verbindungen wurde eine Serie neuer Derivate von A s - 
tetrahydrocannabinol (A S -THC) und Cannabidiol (CBD) durch 3 -dimens ionale Quan- 
titative Struktur-Aktivitatsbeziehungsstudien (3D QSAR) untersucht, einmal mittels 
vergleichender Molekularfeld-Analyse (CoMFA), als auch mit Methoden, die auf 
vergeichenden molekularen Ahnlichkeitsindizes (CoMSIA) basieren. Die hochaktive 
Verbindung AMG3, (C- 1 ’-dithiolan-A -THC) wurde als Template-Molekirl aus dem 
benutzten Datensatz ausgewahlt. Die Bestimmung der potentiel bioaktiven Konforma- 
tion von AMG3 in Fosung erfolgte durch verschiedene molekulare Modellierung- 
stechniken: Monte Carlo (MC), Molkirldynamik (MD) sowie Gitterscananalysen. Das 
erhaltene Konformer wurde dann als Template weiter benutzt, und CB1 und CB2 
Pharmakophor-Modelle wurden entwickelt. Verfirgbare Homologiemodelle von CB1 
und CB2, die auf Rhodopsin basieren, ermoglichten die Konformationsanalyse von 
AMG3 an der Bindungsstelle der Rezeptoren. Die erhaltenenen energetisch begirn- 
stigten Konformere von AMG3 an der Bindungsstelle wurden mit den entsprechenden 
Konformationen in Fosung verglichen. Die stereoelektronischen Eigenschaften der 
Bindungskavitaten eines Rezeptormodells stehen in direktem Zusammenhang mit den 
benutzten molekularen Koordinaten des Modells. In der vorliegenden Arbeit wurde 
auch eine auf dem [i2-Adrenorezeptor basierende Homologie-Modellstudie firr die 
CB1 und CB2-Rezeptoren durchgefirhrt, und mit den Ergebnissen des mit dem auf 
dem Rhodopsin-Rezeptor basierten Homologiemodells verglichen. Ahnliche 
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Bindungsstellen in den CB1 und CB2-Rezeptoren wurden sowohl von den auf 
Rhodopsin basierten Modellen als auch von auf p2-Adrenorezeptor-basierten Model- 
len erzeugt. Die 3D QSAR Modelle wurden regeneriert mithilfe von potentiellen Kon- 
formeren von AMG3 an den Bindungsstellen der CB1- und CB2-Rezeptoren. Die von 
den 3D QSAR/CoMFA bzw: CoMSIA Pharmakophormodellen berechneten relativen 
Beitrage der sterischen und elektrostatischen Felder zeigten, dab die Bioaktivitat der 
Verbindungen hauptsachlich durch sterische Effekte bestimmt wird, obwohl elektro- 
statische Effekte auch eine Rolle spielen. Ein Vergleich entsprechender QSAR Mod- 
elle zeigte, dab die erhaltenen statistischen Resultate positiv beeinflubt wurden, wenn 
die Komplexitat der Rechnungen im Sinne einer realistischeren Modellierung des 
umgebenden Mediums erhoht wurde. Die optimale QSAR Analyse mit partieller mi- 
nimierter quadratischer Abweichung (PLS) wurde in Arbeiten zur de novo Wirkstoff- 
Entwicklung benutzt, und haben zur Entwicklung neuer Verbindungen mit 
verbesserten vorhergesagten Bindungsakti vitaten gefiihrt. 

Im Fall der Fullerenderivate wurde eine Serie von experimentell bekannten als auch 
von theoretisch entwickelten Mono- und Bisaddukten von [60]-Fullerenderivaten 
ausgewahlt und in Bezug auf die Bindungswechselwirkungen zwischen 
Fullerenbasierten Inhibitoren und Immunodefizienzvirus Typ I Endopeptidase (HIV-1 
PR) mithilfe von Dockingsstudien analysiert. MD-Simulationen des freien als auch 
des Inhibitor-gebundenen HIV-1 PR Systems erganzten die genannten Studien und 
lieferten geeignete Startstrukturen fur die Dockingssimulationen von HIV-1 PR. Die 
erhaltenen Ergebnisse zeigen eine unterschiedliche Orienterung der sogennannten [S- 
Haarnadel Laschen zwischen den beiden Systemem. In der Form mit angebundenem 
Fullereninhibitor werden die Laschen in Richtung des B odens des aktiven Bereichs 
hin gezogen (sogenannte geschlossene Form), wahrend die freie Form von HIV-1 PR 
eine halboffene Konformation bevorzugt. Die Strukturanalyse der katalytischen 
Segemente als auch der flexiblen Laschenregionen im Verlauf der Simulation von 
HIV-1 PR unterstutzt das Verstandnis sowohl der strukturellen Praferenzen dieser 
Regionen, als auch der von den Fullerenverbindungen eingenommenen 
Orientierungen innerhalb der aktiven Kavitat des Enzyms. Die Fullerenverbindung 
aus der Datenbank, die sich als die aktivste erwies, wurde anschliebend als Template 
ausgewahlt zur Erstellung von 3D QSAR-Modellen. Die damit erhaltenen 
Konturoberflachen sowie die Ergebnisse der PLS -Analyse wurden fur de novo 
Wirkstoffentwickungsstudien benutzt, mit dem Ziel, neue Fullerenderivate mit 
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hoherem Bindungsaktivitaten vorzuschlagen. Solche Molekiile konnen fiir den 
medizinischen Chemiker zur Synthese neuer HIV-1 PR Inhibitoren mit hohere 
Bioaktivitat von Interesse sein, und damit auf der Suche nach dringend benotigten 
neuen Anti-HIV Wirkstoffen von Bedeutung sein. 
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Appendix 


List of Important Abbreviations 

CB: Cannabinoid 

GPCR: G protein coupled receptors 
A S -THC: A 8 -tetrahydrocannabinol 
CBD: Cannabidiol 

3D QSAR: Three-dimensional quantitative structure-activity relationships 

CoMFA: Comparative molecular field analysis 

CoMSIA: Comparative molecular similarity analysis 

TM: Transmembrane 

MD: Molecular dynamics 

MC: Monte Carlo 

PLS: Partial least squares 

PCR: Principal components regression 

2D-NOESY: Nuclear Overhauser effect spectroscopy 

2D-ROESY : Rotating-frame Overhauser effect spectroscopy 

2D-COSY: Correlated spectroscopy 

r 2 : the squared correlation coefficient 

2 p 

r cv : Cross-validated r 
TSS: Total sum of squares 
ESS: Explained sum of squares 
RSS: Residual sum of squares 
MLR: Multiple linear regression 
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(i) The complete lists of residues consisting the binding site of the CB1 and CB2 re- 
ceptors: 


CB1: Ilel 19, Leul22, Serl23, Leul26, Alai 62, Aspl63, Leul65, Glyl66, Serl67, 
He 167, Phel70, Vall71, Tyrl72, Serl73, Phel74, Ilel75, Aspl76, Phel77, Hisl78, 
Vail 79, Serl85, Argl86, Asnl87, Vail 88, Phel89, Leul90, Phel91, Lysl92, 
Leu 193, Glyl94, Glyl95, Vail 96, Thrl97, Alal98, Serl99, Phe200, Thr201, Ala202, 
Ser203, Val204, Gly205, Ser206, Ile243, Ala244, Val246, Ile247, Ala248, Val249, 
Leu250, Pro251, Leu252, Tyr275, Thr283, Leu287, Val351, Leu352, Ile353, Ile354, 
Cys355, Trp356, Gly357, Pro358, Leu359, Leu360, Ala361, Ile362, Met363, Val364, 
Val367, Val378, Phe379, Ala380, Phe381, Cys382, Ser383, Met384, Leu385, 
Cys386, Leu387, Leu388, Asn389, Ser390, Thr391 and Val392. 

CB2: Phel06, Leul07, Leul08, Ilel 10, Glylll, Seri 12, Vail 13, Thrll4, Metll5, 
Thrl 16, Phell7, Thrll8, Alall9, Glyl22, Leul60, Serl61, Alal62, Leul63, Vall64, 
Serl65, Tyrl66, Leul67, Prol68, Leul69, Aspl89, Tyrl90, Leul91, Leul92, Serl93, 
Trpl94, Leu 196, Phel97, Ilel98, Leu201, Leu254, Cys257, Trp258, Phe259, Pro260, 
Val261, Leu262, Ala263, Leu264, Met265, Ala266, His267, Ser268, Leu269. 

(ii) Complete lists of residues used at the binding site of HIV -1 PR receptor: 

Chain A: Gln7, Arg8, Pro9, Leu23, Leu24, Asp25, Thr26, Gly27, Ala28, Asp29, 
Asp30, Thr31, Val32, Leu33, Met46, Ile47, Gly48, Gly49, Ile50, Gly51, Phe53, Ile54, 
Lys55, Val56, Pro79, Thr80, Pro81, Val82, Asn83, Ile84, Ile85. 

Chain B: Arg8, Ala22, Asp25, Thr26, Gly27, Ala28, Asp29, Asp30, Val32, Met46, 
Ile47, Gly48, Gly49, Ile50, Gly51, Gly52, Phe53, Ile54, Thr80, Pro81, Val82, Ile84, 
Ile85. 
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Figure Al. (top) Potential energy versus time plots throughout the MD simulations 
(include equilibration and simulation parts) for the membrane bound CB1 receptor 
systems (corresponding plot for the equilibration has been shown on the right for clar- 
ity). (bottom) Potential energy versus time plot throughout the MD simulations (in- 
clude equilibration and simulation parts) for membrane bound CB2 receptor systems 
(corresponding plot for the equilibration has been shown on the right for clarity). 
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Figure A2. Receptor backbone RMSD versus time plots throughout the MD simula- 
tions for membrane bound CB1 (left) and CB2 receptor (right) systems. 
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Figure A3. Potential energy versus time (left) and receptor backbone RMSD versus 
time (right) plots throughout the MD simulations of fullerene/HIV-1 PR systems. 
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Figure A4. Torsional angle screening throughout in lipid bilayer simulations for wrapped conformers D and E. 
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